Stage 3


Diagnosing the regression to see what's wrong and what's right.

 

We can use the fairly innovative GLVMA statistic from Peña and Slate (2006) to assess the model with a linear and non-linear response.  Rather unsurprisingly they both come out with some problems, most notbly that there are linearities involved.  For the model using co2:

 

 gvlma(x = my.model)

                     Value   p-value                   Decision

Global Stat        33.3035 1.035e-06 Assumptions NOT satisfied!

Skewness            2.7901 9.485e-02    Assumptions acceptable.

Kurtosis            4.6877 3.038e-02 Assumptions NOT satisfied!

Link Function      25.3560 4.767e-07 Assumptions NOT satisfied!

Heteroscedasticity  0.4698 4.931e-01    Assumptions acceptable.

And using log co2 difference:

 

Global Stat        3.065e+02 0.000e+00 Assumptions NOT satisfied!

Skewness           8.434e-03 9.268e-01    Assumptions acceptable.

Kurtosis           1.391e+02 0.000e+00 Assumptions NOT satisfied!

Link Function      9.804e+01 0.000e+00 Assumptions NOT satisfied!

Heteroscedasticity 6.931e+01 1.110e-16 Assumptions NOT satisfied!

 

Interpreting this is a bit of a black art, but it seems to me that the linear model is less invalid than the non-linear one.  I think at least in the linear model, the skewness and kurtosis are significant because of dependence on the link function (testing for overall linearity of the model).  Whereas with larger, and more significant significant terms for the non-linear model, there are other problems.

 

This doesn't really matter any way, as we can look at the diagnostic graphs produced by the gvlma package which are straightforward.  It's certainly very difficult to envisage any other reason other than increasing greenhouse gas concentration for the rising temperatures observed.

 

The graphs show clearly that volcanic doesn't do a very good job of predicting anomaly, solar is rather better, and co2 is lock step.  Anomaly versus time sequence for directional stat 4 does a very good job of nixing the asertion that the earth has started cooling.  What we're really seeing there is evidence for an  interesting trend for warming followed by an abrupt temperature drop (with a minimum higher than the minimum of the last period) followed by a further rise.  There's certainly no evidence to suggest that "global warming has stopped", which is one of the most idiotic assertions of the climate skeptic brigade.

 

 

 

Peña, E. A., & Slate, E. H. (2006). Global Validation of Linear Model Assumptions. Journal of the American Statistical Association, 101(473), 341-354. doi: 10.1198/016214505000000637

 

Next up, regression diagnostics over a thousand or so years here: Solar ...

 

Here's an similar graph from Lean, J., Beer, J. and Bradley, R. 1995. Reconstruction of solar irradiance since 1610: Implications for climate change. Geophysical Research Letters 22: 3195-3198:

 

 

 

Truncating the data.

 

climate.summary.end <- subset(climate.summary, climate$Year > 1979)

> summary(my.short.model)

Call:

lm(formula = ANOMALY ~ SOLAR + co2_mean, data = climate.summary.end)

Residuals:

     Min       1Q   Median       3Q      Max

-0.19938 -0.05220  0.03270  0.06071  0.09708

Coefficients:

             Estimate Std. Error t value Pr(>|t|)  

(Intercept) -2.366450   0.794575  -2.978  0.00806 **

SOLAR        0.688377   0.348307   1.976  0.06364 .

co2_mean     0.007756   0.002140   3.625  0.00194 **

---

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.09009 on 18 degrees of freedom

  (8 observations deleted due to missingness)

Multiple R-squared: 0.4563,    Adjusted R-squared: 0.3959

F-statistic: 7.553 on 2 and 18 DF,  p-value: 0.004153

> gvlma(my.short.model)

Call:

lm(formula = ANOMALY ~ SOLAR + co2_mean, data = climate.summary.end)

Coefficients:

(Intercept)        SOLAR     co2_mean 

  -2.366450     0.688377     0.007756 

ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS

USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:

Level of Significance =  0.05

Call:

 gvlma(x = my.short.model)

                     Value p-value                   Decision

Global Stat        8.16221 0.08581    Assumptions acceptable.

Skewness           3.35623 0.06695    Assumptions acceptable.

Kurtosis           0.01176 0.91363    Assumptions acceptable.

Link Function      4.28587 0.03843 Assumptions NOT satisfied!

Heteroscedasticity 0.50834 0.47586    Assumptions acceptable.

plot.gvlma(gvlma(my.short.model))

 

 

So you see some cyclical change in variability.  Is this solar or is it ENSO or is it something else?

 

Scaling the same model is also useful:

 

lm(formula = scale(ANOMALY) ~ scale(SOLAR) + scale(co2_mean), data = climate.summary.end)

Call:

lm(formula = scale(ANOMALY) ~ scale(SOLAR) + scale(co2_mean),     data = climate.summary.end)

Coefficients:

    (Intercept)     scale(SOLAR)  scale(co2_mean) 

         0.4280           0.3665           0.9426 

 

which shows that co2 is 3 times more important than solar in the prediction of anomaly.  With the whole model explaining 76% of the variance you can do the maths on the relative contributions.