Compare and contrast:
What Tamas says (over 1000 year period)
What that actually means:
Now you can go back and read the rest of the analysis
Hypothesis 2. The satellite data shows clear warming since 1998.
Extra easy to test!
Grab the data for 1998 until the end of the series:
climate.stupid <- subset(climate, climate$Year > 1998)
Run a correlation test of temperature anomaly against year (note I accidentally called the series nih rather than uah):
cor.test(climate.stupid$nih, climate.stupid$Year)
Pearson's product-moment correlation
data: climate.stupid$nih and climate.stupid$Year
t = 0.9917, df = 8, p-value = 0.3504
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.3773652 0.7949023
sample estimates:
cor
0.3308772
The estimated correlation is +0.33 (i.e. a positive association: the later the year, the greater the anomaly), but the 95% confidence interval spans zero, so this number is statistically indistinguishable from zero - i.e. there is too little data in this time range to say either way.
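To see where those numbers come from, here is a minimal sketch that rebuilds the t statistic, p-value and confidence interval from nothing but the reported correlation and degrees of freedom. It uses the Fisher z transformation, which is how cor.test builds its Pearson interval; the variable names are mine, not part of the original analysis.

```r
# Values copied from the cor.test output above.
r  <- 0.3308772   # sample correlation
df <- 8           # degrees of freedom
n  <- df + 2      # number of (year, anomaly) pairs

# t statistic and two-sided p-value for H0: true correlation = 0.
t_stat <- r * sqrt(df) / sqrt(1 - r^2)
p_val  <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via the Fisher z transformation.
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

print(c(t = t_stat, p = p_val))
print(ci)
```

With only 10 points the standard error on the z scale is 1/sqrt(7), roughly 0.38, which is why the interval is so wide.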
Let's look at the whole UAH data set:
cor.test(climate$nih, climate$Year)
Pearson's product-moment correlation
data: climate$nih and climate$Year
t = 3.9733, df = 27, p-value = 0.000475
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.3099162 0.7965910
sample estimates:
cor
0.6074267
Conclusion: definite warming trend.
Let's compare it with the IPCC warming trend (N hemisphere only)
cor.test(climate.nih.trunc$ANOMALY, climate.nih.trunc$Year, use="complete")
Pearson's product-moment correlation
data: climate.nih.trunc$ANOMALY and climate.nih.trunc$Year
t = 3.3801, df = 11, p-value = 0.006141
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.2683242 0.9077603
sample estimates:
cor
0.713782
Definite warming trend, of roughly the same magnitude and direction as the UAH data suggests. In this sense the data are indistinguishable from each other.
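One rough way to put a number on "indistinguishable" is to compare the two reported correlations on the Fisher z scale. This is only a sketch. It assumes the two samples are independent, which is not really true here because the series cover overlapping years, so treat it as a sanity check rather than a formal test. The variable names are mine.

```r
# Estimates copied from the two cor.test outputs above.
r_uah  <- 0.6074267; n_uah  <- 27 + 2   # df = 27
r_ipcc <- 0.713782;  n_ipcc <- 11 + 2   # df = 11

# Difference of the Fisher z-transformed correlations.
z_diff  <- atanh(r_uah) - atanh(r_ipcc)

# Standard error of that difference, ASSUMING independent samples.
se_diff <- sqrt(1 / (n_uah - 3) + 1 / (n_ipcc - 3))

# Two-sided p-value from the standard normal distribution.
z_stat <- z_diff / se_diff
p_val  <- 2 * pnorm(-abs(z_stat))

print(c(z = z_stat, p = p_val))
```

The resulting p-value is well above 0.05, so under that independence assumption the two correlations cannot be told apart.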
While we're shooting fish in a barrel, let's deal with whether the UAH data shows less warming than the IPCC data.
t.test(climate.nih.trunc$nih, climate.nih.trunc$ANOMALY-climate.nih.trunc$ANOMALY[1])
Note we have to perform some jiggery-pokery on the ANOMALY data (subtracting its first value) so that it's on the same 1978 starting base as the UAH data set.
Welch Two Sample t-test
data: climate.nih.trunc$nih and climate.nih.trunc$ANOMALY - climate.nih.trunc$ANOMALY[1]
t = 1.8254, df = 28.885, p-value = 0.0783
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.009231537 0.162297226
sample estimates:
mean of x mean of y
0.15565079 0.07911795
The p-value of 0.0783 is greater than 0.05, which shows us that, at the 5% level, the mean IPCC anomaly and the mean UAH anomaly are statistically indistinguishable.
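For anyone who wants to try the rebasing step without the original data, here is a minimal sketch on made-up numbers. The two series below are invented for illustration (they are not the UAH or IPCC figures), and the names uah, ipcc and ipcc_rebased are hypothetical.

```r
# HYPOTHETICAL data: two invented anomaly series on different baselines.
set.seed(1)
year <- 1979:1991
uah  <- 0.01 * (year - 1979) + rnorm(length(year), sd = 0.1)
ipcc <- 0.30 + 0.01 * (year - 1979) + rnorm(length(year), sd = 0.1)

# Rebase: subtract the first observation so the series starts at zero,
# mirroring ANOMALY - ANOMALY[1] in the call above.
ipcc_rebased <- ipcc - ipcc[1]

# t.test defaults to the Welch test (var.equal = FALSE).
res <- t.test(uah, ipcc_rebased)
print(res$method)
print(res$p.value)
```

Note that rebasing on a single year carries that year's noise into every value of the rebased series, so the comparison of means is sensitive to the choice of starting point.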
Comments (16)
kenlambert said
at 12:58 am on Aug 8, 2009
So you have expanded the vertical axis by 15 times. What is your point?
Kieren Diment said
at 2:08 am on Aug 8, 2009
It's exactly what Tamas did. Except he also truncated the x axis by 3/100ths as well, to provide an even more dishonest answer to the question.
It's nice to have real (comprehensive) data to demonstrate your point.
Stubborn Mule said
at 9:39 pm on Aug 12, 2009
Kieren, I'd have to say that using correlation between the year and the time series value is not a great way to identify trends in time series. It's easy enough to cook up data that has no real trend and yet will appear, through the cor.test, to have a positive correlation between time and value to a 95% confidence level. Another approach, with values equally spaced in time, is to do a t.test on the diff of the series to get a sense of whether the series has positive drift. Doing this for the full UAH data set gives a mean drift of -0.001 per annum and a 95% confidence range on the drift of -0.077 to +0.074. However, I wouldn't read too much into this as the data series is very short.
Kieren Diment said
at 7:00 am on Aug 13, 2009
Stubborn:
You're right, I'm no time series aficionado. However, as both are based on the z distribution of the variance of the data, the correlation and your t statistic are actually equivalent in this case (the point biserial correlation is the most well known of these equivalences).
Have you got the R code for your running t? It sounds useful and is certainly more elegant than my correlation statistic.
Just for clarity: there's no cooking up data here. I'm pulling data from a source and using it unaltered to respond to specific claims from the solipsists.
Ken: Stubborn does financial stats. I do social science stats.
Stubborn Mule said
at 8:44 am on Aug 13, 2009
Didn't mean to suggest you were cooking up data: I mean I could cook up a data series with no drift but positive correlation between time and value to prove the point. Part of the problem is that the correlation test is based on the assumption that you are working with a series of samples (x1, y1), (x2, y2), (x3, y3),... from two random variables X and Y, while testing correlation of time and value means that (1) time is not a random variable and (2) if the time series is not (even weakly) stationary, the samples are not independent.
Kieren Diment said
at 9:30 am on Aug 13, 2009
Stubborn:
I get your point, and understand the risk. In this case I'm pretty sure that t and r are equivalent, but in multivariate situations this is obviously a hairy way of doing things.
kenlambert said
at 12:12 am on Aug 14, 2009
Boys - just a suggestion from a statistically limited engineer - why don't you integrate the areas under the curves and produce a running mean which equalizes the positive and negative areas preceding. Assuming Temperature is a proxy for Energy Forcing which is approx linear, the mean curve obtained will track the energy balance.
Kieren - is Stubborn questioning your statistical methods above??
kenlambert said
at 12:18 am on Aug 14, 2009
Unless of course - Temp is just like a 'random walk' thru space-time...
See: http://wattsupwiththat.com/2009/08/12/is-global-temperature-a-random-walk/
Kieren Diment said
at 8:42 am on Aug 14, 2009
Ken,
The random walk stuff just shows that the variation exceeds the trend for certain resolutions in the time series. It doesn't question the trend. You could use this methodology to resolve a meaningful period of time within which to assess the trend however.
More solipsism here - "what if the data is just meaningless?"
Stubborn has pointed out that I'm not dealing with time series in the way that convention would normally suggest it should be dealt with. He is correct in this. However in this case what I've done is a statistical equivalent of the conventional approach due to the simplicity of the statistical hypothesis.
Kieren Diment said
at 8:42 am on Aug 14, 2009
Don't see the purpose of assessing energy balance when temperature anomaly is a perfectly good proxy for this.
kenlambert said
at 7:30 pm on Aug 14, 2009
You must have a statistical tool somewhere which sums the preceding areas above and below a running mean fit. It is still a temperature measurement. The procedure would be iterative.
Indeed is the 'random walk' man just wrong in his conclusions?
Kieren Diment said
at 7:38 pm on Aug 14, 2009
The integrative statistics is not my area - in the social sciences, we don't have sufficient precision to make that kind of evaluation meaningful. I suspect it would tell you the same information as the linear regression as it's just using the same data from the least squares fit.
The random walk guy isn't really thinking very clearly statistically. If his conclusion was true then for a given time series duration, the gradient of the line would have to have an equal probability of being positive, negative or zero at all points along the population of measurement.
Kieren Diment said
at 7:44 pm on Aug 14, 2009
err population of measurement times
Stubborn Mule said
at 9:22 pm on Aug 16, 2009
Kieren. A word of caution in relation to the Welch two sample t-test you use at the end there. The null hypothesis is that the difference of the means is zero and you got a p-value of 0.0783. That would suggest that, with 90% confidence, you can <em>reject</em> the hypothesis that the means are the same, not the other way around. Of course, I wouldn't get too excited about that. As I argue on the <a href="http://climatekaraoke.pbworks.com/TimeSeries">TimeSeries page</a>, since neither sample is iid, the Welch test is not very meaningful here anyway.
Stubborn Mule said
at 9:22 pm on Aug 16, 2009
Apparently you can't use markup in comments :)
Stubborn Mule said
at 10:54 am on Aug 18, 2009
Ken: I've taken up the integration under the curve suggestion here: http://climatekaraoke.pbworks.com/TimeSeries