
Solar


Understanding our model by examining the error.

 

This is a bit heavy and technical.  But it's worth it.

To understand our model's predictive capability, we can look at what it doesn't predict well.  In telling this story we can also see that the peak of the mediaeval warming period and the present day warming trend fall out of our analysis, and we can compare their predictability from the IPCC's solar data.

First a bit of statistical theory.  If the sun is driving climate change, then we would expect the measure of solar forcing that we've used previously to be the best predictor of temperature anomaly.  In fact we already know this is not the case - CO2 predicts anomaly much better where that data is available.

But given that we don't have CO2 readings before the mid 18th century, and that CO2 was pretty much constant before then, we can try predicting anomaly with solar across the whole data set.

The statistic that we'll use here is the residual: what's left over when we subtract the predicted value of anomaly, calculated from our model, from the observed value.  A positive residual is where the model under-estimates the observed value, and a negative residual is where it over-estimates it.  The residuals from a well-behaved model should be randomly distributed - the sign and magnitude of a residual should not be predictable from observation to observation.  Therefore, in time series data, long runs of positive or negative residuals point to underlying problems - influences external to the model that are not being accounted for.  So let's get modelling:

solar.model <- lm(climate$ANOMALY ~ climate$SOLAR)
summary(solar.model)  # show the coefficient table and fit statistics

which gives the result:

Coefficients:

              Estimate Std. Error t value Pr(>|t|)   

(Intercept)   0.062312   0.004483    13.9   <2e-16 ***

climate$SOLAR 0.687792   0.022113    31.1   <2e-16 ***

---

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1358 on 1099 degrees of freedom

Multiple R-squared: 0.4682,    Adjusted R-squared: 0.4677

F-statistic: 967.4 on 1 and 1099 DF,  p-value: < 2.2e-16

So while solar is a significant predictor, it explains only 47% of the variance in anomaly, whereas CO2 explains 88%.  Remember that the AIC also confirms CO2 as an improvement in fit.
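
For reference, that AIC comparison takes one line in R.  This is only a sketch - it assumes the CO2 model fitted on the earlier page is still around as co2.model (that name is mine, not from the original analysis):

AIC(solar.model, co2.model)  # lower AIC indicates the better-fitting model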

We can get the residuals from this with:

solar.model$residuals

And we can standardise the residuals ((residual - mean residual) / standard deviation) for convenience.
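
In R this takes a couple of lines.  A minimal sketch - the name std_resid is mine, chosen to match the column name used further down:

raw.resid <- solar.model$residuals
std_resid <- (raw.resid - mean(raw.resid)) / sd(raw.resid)  # standardised residuals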

Next I did some spreadsheet work and computer programming to determine whether each residual was positive or negative, and then to find how many consecutive residuals of the same sign sit next to each other in the dataset.

resid.stream<-read.csv("resid_stream.csv")

There are 139 points in the data set where the sign of the residual changes.  We're only interested in the longest of these runs, meaning those whose standardised length is greater than 3 (i.e. longer than 99% of the population of runs, assuming run length is normally distributed).
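
The run-finding itself can also be done directly in R with rle() rather than in a spreadsheet.  A sketch only - it assumes std_resid is the vector of standardised residuals from above, and that climate$YEAR holds the year for each observation (a column name I'm guessing at):

runs    <- rle(std_resid > 0)                      # consecutive runs of positive / negative residuals
run.end <- cumsum(runs$lengths)                    # index of the last observation in each run
len.z   <- (runs$lengths - mean(runs$lengths)) / sd(runs$lengths)  # standardised run length
long    <- len.z > 3                               # keep only the unusually long runs (> 3 SDs)
data.frame(year_stop = climate$YEAR[run.end][long],
           resid     = ifelse(runs$values[long], "+", "-"),
           length    = runs$lengths[long])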

year_stop  resid  length (years)  mean resid  sd resid
1455       +      61              1.05        0.45
1567       +      56              0.74        0.26
1661       -      74              -1.21       0.68
1784       -      108             -0.66       0.34
2000       +      82              1.42        0.63

So we see that there were three long periods of positive residuals: 61 years up to 1455, 56 years up to 1567 and 82 years up to 2000.  We can compare the 1455 data with the 2000 data; these correspond to the peak of the mediaeval warming period and the present day.  The R syntax that I found to do this is pretty ugly:

mwp <- subset(  subset(solar.model.resids,X >= (1455-61-900)) , X < (1454-900))$std_resid

present <- subset(  subset(solar.model.resids,X >= (2000-82-900)) , X < (2000-900))$std_resid
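
A slightly tidier equivalent collapses each pair of nested subset() calls into one.  This is a sketch under the same assumptions as the code above - that solar.model.resids has a column X which appears to hold the year minus 900, and a std_resid column:

mwp     <- subset(solar.model.resids, X >= (1455 - 61 - 900) & X < (1454 - 900))$std_resid
present <- subset(solar.model.resids, X >= (2000 - 82 - 900) & X < (2000 - 900))$std_resid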

Then we run an independent-samples t-test to see if the mean residuals differ.

t.test(mwp,present)

Welch Two Sample t-test

data:  mwp and present

t = -4.1565, df = 139.889, p-value = 5.603e-05

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-0.5608582 -0.1992871

sample estimates:

mean of x mean of y

1.048565  1.428638

The statistically significant value of t shows that the relationship between solar and anomaly is of a different character during the peak of the MWP than during the last 80-odd years, which strongly suggests that the warming in the two periods is due to different things.

You can't conclusively demonstrate this, though, as that would require solar irradiance data for the entire period.  You can however see clearly that solar irradiance is not linearly related to the solar forcing data we have here by looking at the graph on this page.

 

Now for a bit of speculation backed by data ...

 

Going back to this graph (also shown on the Stage 3 page - standardised residual versus time series, bottom right) we can see that the residuals from the complete model of the latest 150 years are generally positive in recent years - that is, our CO2 model is underestimating temperature in the last 20 or so years.  Why would this be?  Random chance?  Other greenhouse gases (methane, HFCs, N2O etc.)?  This possible trend raises the concern that some new warming process has started in recent years - could this be the start of runaway global warming?  There's certainly no evidence for the start of any negative feedback mechanisms in this data anyway ...
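
A quick way to check that pattern directly, rather than by eyeballing the graph, is to look at the most recent residuals themselves.  Just a sketch - it assumes the complete model from the Stage 3 page is stored as complete.model (my name for it) and that the data are annual:

late <- tail(residuals(complete.model), 20)  # residuals for the most recent ~20 years
mean(late)                                   # a mean well away from zero suggests a systematic miss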

 

 

I think the next step is to re-run a selection of this analysis with the NIH data.  My hypothesis is that it won't make the blindest bit of difference to the magnitude and direction of the overall model.  That's all coming up on the Sattelite Data page.

Comments (2)

kenlambert said

at 11:19 pm on Jul 30, 2009

Just looked at your page. Have no time to comment other than to say that without F.cloud albe and F.directalbedo being major cooling forcings in your analysis - and use of a parallel set of temperature data (UAH for example) - I can't say if your analysis or conclusions are realistic. Getting historical cloud albedo and measures of cloudiness over the last 1000 years would seem difficult, although I have read of a novel way of doing it by looking at the old Masters' painted landscapes and statistically analysing the amount of painted clouds in the painted skies.

Kieren Diment said

at 9:54 pm on Aug 10, 2009

Ken: useful link for you with references: http://en.wikipedia.org/wiki/Global_warming#Solar_variation
