Idiot

Page history last edited by Kieren Diment 14 years, 8 months ago

Compare and contrast:

 

What Tamas says (over 1000 year period)

 

 

What that actually means:

 

Now you can go back and read the rest of the analysis

 

Hypothesis 2.  The satellite data shows clear warming since 1998.

 

Extra easy to test!

 

Grab the data for 1998 until the end of the series:

 

climate.stupid <- subset(climate, climate$Year > 1998)

 

Run a correlation test of temperature anomaly against year (note I accidentally called the series nih rather than uah):

 

cor.test(climate.stupid$nih, climate.stupid$Year)

    Pearson's product-moment correlation

data:  climate.stupid$nih and climate.stupid$Year

t = 0.9917, df = 8, p-value = 0.3504

alternative hypothesis: true correlation is not equal to 0

95 percent confidence interval:

 -0.3773652  0.7949023

sample estimates:

      cor

0.3308772

 

An estimated correlation of +0.33 (i.e. a positive association: the later the year, the greater the anomaly), but the 95% confidence interval (-0.38 to 0.79) spans zero, so this estimate is statistically indistinguishable from zero. In other words, there is too little data in this time range to say anything either way.
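To see where that interval comes from, here is a minimal sketch that rebuilds the t statistic, p-value and 95% confidence interval from nothing but the reported correlation and degrees of freedom, using the Fisher z transformation that cor.test applies for Pearson correlations. The helper name cor_summary is made up for illustration; it is not part of R or of the original analysis.

```r
# Hypothetical helper: recover the cor.test summary from r and df alone.
cor_summary <- function(r, df, conf.level = 0.95) {
  n <- df + 2                              # Pearson test has df = n - 2
  t_stat <- r * sqrt(df) / sqrt(1 - r^2)   # t statistic for H0: rho = 0
  p_value <- 2 * pt(-abs(t_stat), df)      # two-sided p-value
  z <- atanh(r)                            # Fisher z transform of r
  half <- qnorm(1 - (1 - conf.level) / 2) / sqrt(n - 3)
  ci <- tanh(c(z - half, z + half))        # back-transform to the r scale
  list(n = n, t = t_stat, p.value = p_value, conf.int = ci)
}

# Values reported above for the post-1998 subset.
short <- cor_summary(r = 0.3308772, df = 8)
print(short)
```

With only ten points, the half-width of the interval on the z scale is 1.96/sqrt(7), about 0.74, which is why it straddles zero so comfortably.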

 

Let's look at the whole UAH data set:

 

cor.test(climate$nih, climate$Year)

    Pearson's product-moment correlation

data:  climate$nih and climate$Year

t = 3.9733, df = 27, p-value = 0.000475

alternative hypothesis: true correlation is not equal to 0

95 percent confidence interval:

 0.3099162 0.7965910

sample estimates:

      cor

0.6074267

 

Conclusion: definite warming trend.
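A correlation says the trend is there but not how steep it is. A rough sketch of how one might estimate the slope with a straight-line fit follows. The data frame here is simulated (made-up slope, noise and year range), standing in for the real climate data, so the numbers it prints mean nothing; only the column names Year and nih follow the code above.

```r
set.seed(1)  # simulated stand-in data, NOT the real satellite series
climate.sim <- data.frame(Year = 1979:2007)   # assumed 29-year range
climate.sim$nih <- 0.015 * (climate.sim$Year - 1979) +
  rnorm(nrow(climate.sim), sd = 0.15)

# Straight-line fit: the Year coefficient is the trend per year.
fit <- lm(nih ~ Year, data = climate.sim)
slope_t <- unname(summary(fit)$coefficients["Year", "t value"])

# For a single predictor, the slope's t statistic equals the t from cor.test.
cor_t <- unname(cor.test(climate.sim$nih, climate.sim$Year)$statistic)

print(coef(fit)["Year"])       # estimated trend per year
print(confint(fit, "Year"))    # 95% confidence interval for the trend
print(c(slope_t = slope_t, cor_t = cor_t))
```

The two t statistics agree because, with one predictor, testing the slope against zero and testing the correlation against zero are the same test. The regression just adds the size of the trend and its uncertainty.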

 

Let's compare it with the IPCC warming trend (N hemisphere only):

 

cor.test(climate.nih.trunc$ANOMALY, climate.nih.trunc$Year)

    Pearson's product-moment correlation

data:  climate.nih.trunc$ANOMALY and climate.nih.trunc$Year

t = 3.3801, df = 11, p-value = 0.006141

alternative hypothesis: true correlation is not equal to 0

95 percent confidence interval:

 0.2683242 0.9077603

sample estimates:

     cor

0.713782

 

Definite warming trend, in the same direction and of roughly the same strength as the UAH data suggests: the correlation of 0.71 falls inside the UAH 95% confidence interval (0.31 to 0.80).  In this sense the data are indistinguishable from each other.
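One hedged way to put a number on "indistinguishable" is to compare the two correlations on the Fisher z scale. The sketch below uses only the r and df values reported above. It assumes the two samples are independent, which is not true here (both series cover overlapping years), so treat it as a rough check rather than a proper test.

```r
# Compare two Pearson correlations via Fisher's z (assumes independent samples).
r_uah  <- 0.6074267; n_uah  <- 27 + 2   # full satellite series, df = 27
r_ipcc <- 0.713782;  n_ipcc <- 11 + 2   # truncated IPCC series, df = 11

z_diff  <- atanh(r_uah) - atanh(r_ipcc)               # difference on the z scale
se      <- sqrt(1 / (n_uah - 3) + 1 / (n_ipcc - 3))   # standard error of that difference
z_stat  <- z_diff / se
p_value <- 2 * pnorm(-abs(z_stat))                    # two-sided p-value

print(c(z = z_stat, p = p_value))
```

The z statistic comes out at about half a standard error, nowhere near any conventional threshold for calling the two correlations different.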

 

While we're shooting fish in a barrel, let's deal with whether the UAH data shows less warming than the IPCC data.

 

t.test(climate.nih.trunc$nih, climate.nih.trunc$ANOMALY-climate.nih.trunc$ANOMALY[1])

 

Note we have to perform some jiggery-pokery on the ANOMALY data so that it's on the same 1978 starting base as the UAH data set.

    Welch Two Sample t-test

data:  climate.nih.trunc$nih and climate.nih.trunc$ANOMALY - climate.nih.trunc$ANOMALY[1]

t = 1.8254, df = 28.885, p-value = 0.0783

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -0.009231537  0.162297226

sample estimates:

 mean of x  mean of y

0.15565079 0.07911795

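The p-value of 0.078 is greater than 0.05, so at the 5% significance level we cannot reject the hypothesis that the mean IPCC anomaly and the mean UAH anomaly are equal; by that standard the two are statistically indistinguishable.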
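As a sanity check on how to read that output, this sketch recomputes the p-value and confidence intervals from the t statistic, degrees of freedom and sample means printed above. Nothing here touches the underlying data, and the 90% interval is an extra that the original output does not show.

```r
# Numbers copied from the Welch test output above.
t_stat    <- 1.8254
df        <- 28.885
mean_diff <- 0.15565079 - 0.07911795   # mean of x minus mean of y

se      <- mean_diff / t_stat          # standard error implied by t
p_value <- 2 * pt(-abs(t_stat), df)    # two-sided p-value

# Confidence interval for the difference in means at a given level.
welch_ci <- function(level) {
  mean_diff + c(-1, 1) * qt(1 - (1 - level) / 2, df) * se
}
ci95 <- welch_ci(0.95)
ci90 <- welch_ci(0.90)

print(p_value)
print(ci95)
print(ci90)
```

The 95% interval includes zero while the 90% one does not, so the result is borderline: not significant at the 5% level, but significant at the 10% level.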

Comments (16)

kenlambert said

at 12:58 am on Aug 8, 2009

So you have expanded the vertical axis by 15 times. What is your point?

Kieren Diment said

at 2:08 am on Aug 8, 2009

It's exactly what Tamas did. Except he also truncated the x axis by 3/100ths as well, to provide an even more dishonest answer to the question.

It's nice to have real (comprehensive) data to demonstrate your point.

Stubborn Mule said

at 9:39 pm on Aug 12, 2009

Kieren, I'd have to say that using correlation between the year and the time series value is not a great way to identify trends in time series. It's easy enough to cook up data that has no real trend and yet will appear, through the cor.test, to have a positive correlation between time and value to a 95% confidence level. Another approach, with values equally spaced in time, is to do a t.test on the diff of the series to get a sense of whether the series has positive drift. Doing this for the full UAH data set gives a mean drift of -0.001 per annum and a 95% confidence range on the drift of -0.077 to +0.074. However, I wouldn't read too much into this as the data series is very short.

Kieren Diment said

at 7:00 am on Aug 13, 2009

Stubborn:

You're right, I'm no time series aficionado. However, as both are based on the z distribution of the variance of the data, the correlation and your t statistic are actually equivalent in this case (the point biserial correlation is the most well known of these equivalences).

Have you got the R code for your running t? It sounds useful and is certainly more elegant than my correlation statistic.

Just for clarity: there's no cooking up data here. I'm pulling data from a source and using it unaltered to respond to specific claims from the solipsists.

Ken: Stubborn does financial stats. I do social science stats.

Stubborn Mule said

at 8:44 am on Aug 13, 2009

Didn't mean to suggest you were cooking up data: I mean I could cook up a data series with no drift but positive correlation between time and value to prove the point. Part of the problem is that the correlation test is based on the assumption that you are working with a series of samples (x1, y1), (x2, y2), (x3, y3),... from two random variables X and Y, while testing correlation of time and value means that (1) time is not a random variable and (2) if the time series is not (even weakly) stationary, the samples are not independent.

Kieren Diment said

at 9:30 am on Aug 13, 2009

Stubborn:

I get your point, and understand the risk. In this case I'm pretty sure that t and r are equivalent, but in multivariate situations, then this is obviously a hairy way of doing things.

kenlambert said

at 12:12 am on Aug 14, 2009

Boys - just a suggestion from a statistically limited engineer - why don't you integrate the areas under the curves and produce a running mean which equalizes the positive and negative areas preceding. Assuming Temperature is a proxy for Energy Forcing which is approx linear, the mean curve obtained will track the energy balance.

Kieren - is Stubborn questioning your statistical methods above??

kenlambert said

at 12:18 am on Aug 14, 2009

Unless of course - Temp is just like a 'random walk' thru space-time...

See: http://wattsupwiththat.com/2009/08/12/is-global-temperature-a-random-walk/

Kieren Diment said

at 8:42 am on Aug 14, 2009

Ken,

The random walk stuff just shows that the variation exceeds the trend for certain resolutions in the time series. It doesn't question the trend. You could use this methodology to resolve a meaningful period of time within which to assess the trend however.

More solipsism here - "what if the data is just meaningless?"

Stubborn has pointed out that I'm not dealing with time series in the way that convention would normally suggest it should be dealt with. He is correct in this. However in this case what I've done is a statistical equivalent of the conventional approach due to the simplicity of the statistical hypothesis.

Kieren Diment said

at 8:42 am on Aug 14, 2009

Don't see the purpose of assessing energy balance when temperature anomaly is a perfectly good proxy for this.

kenlambert said

at 7:30 pm on Aug 14, 2009

You must have a statistical tool somewhere which sums the preceding areas above and below a running mean fit. It is still a temperature measurement. The procedure would be iterative.

Indeed is the 'random walk' man just wrong in his conclusions?

Kieren Diment said

at 7:38 pm on Aug 14, 2009

The integrative statistics is not my area - in the social sciences, we don't have sufficient precision to make that kind of evaluation meaningful. I suspect it would tell you the same information as the linear regression as it's just using the same data from the least squares fit.

The random walk guy isn't really thinking very clearly statistically. If his conclusion was true then for a given time series duration, the gradient of the line would have to have an equal probability of being positive, negative or zero at all points along the population of measurement.

Kieren Diment said

at 7:44 pm on Aug 14, 2009

err population of measurement times

Stubborn Mule said

at 9:22 pm on Aug 16, 2009

Kieren. A word of caution in relation to the Welch two sample t-test you use at the end there. The null hypothesis is that the difference of the means is zero and you got a p-value of 0.0783. That would suggest that, with 90% confidence, you can <em>reject</em> the hypothesis that the means are the same, not the other way around. Of course, I wouldn't get too excited about that. As I argue on the <a href="http://climatekaraoke.pbworks.com/TimeSeries">TimeSeries page</a>, since neither sample is iid, the Welch test is not very meaningful here anyway.

Stubborn Mule said

at 9:22 pm on Aug 16, 2009

Apparently you can't use markup in comments :)

Stubborn Mule said

at 10:54 am on Aug 18, 2009

Ken: I've taken up the integration under the curve suggestion here: http://climatekaraoke.pbworks.com/TimeSeries
