Famous Hockey Sticks in History

I do love hockey; it is by far my favorite sport. Probably half of my Canadian readers will stop reading when I say that I am a huge Red Wings fan.

If you came here looking for an NHL discussion, though, I apologize: this post is about the false mathematics behind the most famous hockey stick temperature curves.

This post, like many others here, is in response to a request. This time, instead of odd linear curves, I used GISS (ground instrument) temperature with the 1357 M08 proxies before infilling, which means “as close as I can get right now to the real data.”

This post clearly shows that actual HS data can be made to support any conclusion. Hockey stick graphs are not temperature at all; they are in fact representations of the statistical distortion created by mathematical sorting of random or near-random data.

Everything in this post uses actual proxy data from M08. I used the CPS methods recommended in the latest hockey stick paper, and in many before it, to produce these graphs.

The red line is GISS annual data smoothed with an 11-year square filter; the blue line represents the M08 proxies after the recommended math (CPS) is performed. The GISS measured temperature record is about 130 years long.

What happens to the proxies when I apply CPS to temperature centered at the actual correct dates? A beautiful HS which would make any climatologist proud.


I flipped the red GISS data upside down. Oops, negative hockey stick. Same methods, different correlation graph.

Another HS like the first one. But wait, temperatures rose before 1900 AD?

Same as above with GISS temperature flipped.

Positive again; this time the temp rise is prior to 1800.

Negative again? Temps dropped before 1800?

Positive temp rise again, ending at 1600.

Or was it negative at 1600?

How much stronger does the evidence of this faulty math have to be? The methods clearly amplify the trend you are looking for no matter what the trend is.
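A minimal sketch of the screening effect, using synthetic red noise rather than the M08 proxies. The AR(1) coefficient of 0.9, the 0.3 correlation cutoff, the linear ramp standing in for the instrumental record, and the plain averaging (in place of the full CPS scaling step) are all assumptions chosen for illustration. The same pile of random series is screened twice, once against a rising target and once against the flipped target.

```python
import math
import random


def ar1_series(rng, n, rho):
    """Return n samples of AR(1) red noise with lag-one coefficient rho."""
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def screened_composite(series, target, r_min):
    """Average the series whose last len(target) points correlate above r_min."""
    cal = len(target)
    kept = [s for s in series if pearson(s[-cal:], target) > r_min]
    if not kept:
        raise ValueError("no series passed screening")
    composite = [sum(col) / len(kept) for col in zip(*kept)]
    return composite, len(kept)


N_SERIES, LENGTH, CAL = 500, 400, 100       # hypothetical sizes
rng = random.Random(0)
noise = [ar1_series(rng, LENGTH, 0.9) for _ in range(N_SERIES)]
rising = [i / (CAL - 1) for i in range(CAL)]  # stand-in for measured temperature
falling = [-t for t in rising]                # the same record flipped upside down

up, n_up = screened_composite(noise, rising, 0.3)
down, n_down = screened_composite(noise, falling, 0.3)
r_up = pearson(up[-CAL:], rising)
r_down = pearson(down[-CAL:], rising)
print(f"kept {n_up} series for the rising target, composite r = {r_up:.2f}")
print(f"kept {n_down} series for the flipped target, composite r = {r_down:.2f}")
```

Because every kept series has a positive covariance with the target it was screened against, the averaged composite must follow that target in the calibration window, whichever way the target points. No temperature information is present in the input at all.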

For an even more extreme demonstration, see my older post:

Will the Real Hockey Stick Please Stand Up?

17 thoughts on “Famous Hockey Sticks in History”

  1. The problem is the calibration with only a fraction of the data. I am a land surveyor, and one of the basics of measurement that is drilled into your head is the effect of the propagation of errors, which can be systematic (instrument bias, etc.) or random (no measurement can be 100% accurate). The methodology a land surveyor would follow, if he really needs (and really has no other option) to do something like this calibration, is to do a pre-analysis of the errors he can expect in the source data: in this case, the accuracy of the proxy data to be calibrated (especially how closely it actually represents temperature) and the accuracy of the direct temperature measurements. After that he will assess how these errors propagate through the whole network; they are going to be amplified at the far end, i.e. at the oldest data. Thus, to prevent large errors from accumulating, one needs very small errors in the calibration. Ideally one tries to find independent verification of the values at some points along the data series to be calculated, i.e. in this case fairly accurately measured or derived temperature data (e.g. oxygen-18 isotope data, if this is accurate enough).
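One way to picture the point in the comment above is a toy linear calibration, fitted many times to noisy data and then applied both inside and far outside the range it was calibrated on. This is only a sketch under stated assumptions: the true relation y = 2x, the noise level of 0.2, the 20 calibration points, and the two query points are all hypothetical, and reading "the far end" as "far from the calibration range" is an interpretation, not something the comment spells out.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def prediction_spread(rng, x_query, trials=500, noise_sd=0.2):
    """Std. dev. of the calibrated value at x_query over repeated noisy calibrations."""
    xs = [i / 19 for i in range(20)]  # calibration range: 0..1
    preds = []
    for _ in range(trials):
        # Hypothetical true relation y = 2x, observed with random measurement error.
        ys = [2.0 * x + rng.gauss(0.0, noise_sd) for x in xs]
        intercept, slope = fit_line(xs, ys)
        preds.append(intercept + slope * x_query)
    return statistics.pstdev(preds)


rng = random.Random(42)
near = prediction_spread(rng, 0.5)   # middle of the calibration range
far = prediction_spread(rng, -5.0)   # far outside the calibration range
print(f"spread inside the calibration range: {near:.3f}")
print(f"spread far outside it:               {far:.3f}")
```

The same small random errors in the calibration data produce a much larger spread once the fitted line is carried far from where it was calibrated, which is why the calibration errors need to be very small to begin with.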

  2. The whole issue of propagation of errors seems to be ignored by many people these days. I would really like to see Mann, Phil Jones, or Briffa do a study of the propagation of errors in their works. I am afraid that the total error would overwhelm the signal. The error bars are only part of the total error.

    Proxies have measurement errors, temperature calibration has errors, there are algorithm errors in the computer code, and the list goes on. Ignoring errors is so much easier than worrying about their effects.

  3. @2
    The problem with tree rings as a proxy for temperature is not instrumental, it’s physical!
    Tree rings have no more skill at reproducing temperature than red noise. Period.

    The whole thing stinks, and it will go into the garbage bin of science history like phlogiston, Piltdown Man and many other examples of “consensus science”.

  4. Jeff,

    Could you clarify the percent used? The argument from Mann et al. will be that the selected series have a real temperature signal because the number of matches exceeds the number of matches expected from random data. I assume that means that more than 5% of the series are selected, but I see some of your matches select as many as 10% of the series.

  5. Raven,

    My percent used was #series/1209. I’m not a statistician. When you consider that most of the series in M08 only need to pass an r>0.1 or r>0.13 with more than 13% (I believe) being considered above random, it is pretty interesting that the non-infilled data can achieve these kinds of numbers with negative trends and higher r values.

    IMO it is because of the 10-year filtering of the GISS data (in this case) and the raw data that the M08 estimate of autocorrelation effects is wrong. Again, I am not an expert in this; I just have some basic feel for it. I’m sure the p calculation had some problem because I keep passing high percentages of the data, sort of a moron’s Monte Carlo.

    In my other post where I used linear trends, I think you know I achieved a 40% correlation to a negative linear trend. This is more serious evidence of the problems associated with the methodology I think.
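The autocorrelation point in the comment above can be sketched with a small Monte Carlo. The r > 0.13 cutoff and the 10-year filter width come from the comment; everything else is an assumption made for illustration: the "proxies" and the 130-year target are pure white noise, the filter is a plain boxcar, and this is not the M08 significance calculation. The sketch only counts how often series with no signal at all pass the cutoff, before and after smoothing.

```python
import math
import random


def moving_average(x, width):
    """Square (boxcar) filter; keeps only the fully overlapped windows."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def pass_rate(rng, target, r_min, trials, width=None):
    """Fraction of white-noise series whose correlation with target exceeds r_min.

    If width is given, the target and every noise series are smoothed first.
    """
    ref = moving_average(target, width) if width else target
    passed = 0
    for _ in range(trials):
        noise = [rng.gauss(0.0, 1.0) for _ in range(len(target))]
        if width:
            noise = moving_average(noise, width)
        if pearson(noise, ref) > r_min:
            passed += 1
    return passed / trials


rng = random.Random(1)
target = [rng.gauss(0.0, 1.0) for _ in range(130)]  # stand-in for a 130-year record
raw_rate = pass_rate(rng, target, 0.13, 1000)
smooth_rate = pass_rate(rng, target, 0.13, 1000, width=10)
print(f"pass rate, unsmoothed: {raw_rate:.1%}")
print(f"pass rate, 10-point smoothing: {smooth_rate:.1%}")
```

Smoothing leaves far fewer independent points in each series, so chance correlations above a fixed cutoff become much more common. A "percent above random" figure that does not account for this will overstate how many series carry a real signal.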

  6. Jeff,

    The match with negative trends is no surprise since many proxies are anti-correlated with temperatures. Mann takes that into account when he does his tests.

  7. Don’t forget, Mann infilled positive hockey stick data on 1083 proxies before his sorting. I achieved a 40 percent correlation to a negative trend without infilling my own negative data.

    I believe I got a high correlation % to different GISS data plots with a low r>.1 also, I just didn’t post it.

  8. Now that I think of it, doesn’t the infilling of a positive data trend basically mean he is assuming a positive correlation to temp? How do you believe Mann is taking that into account?

  9. It was a very low percentage. Only 18 of 484 proxies were accepted with a negative correlation; these should have been flipped positive for averaging. It seems like mostly the weird stuff, though. Some odd things here; I was thinking of asking SMcI about them:

    Zhang, D. 1980: Winter temperature changes during the last 500 years in South China. Chinese Science Bulletin 25(6), 497–500.

    Why would a negative correlation work with this series?

    Linsley, B.K., R.B. Dunbar, G.M. Wellington, and D.A. Mucciarone, 1994, A coral-based reconstruction of Intertropical Convergence Zone variability over Central America since 1707, Journal of Geophysical Research, vol 99, No. C5, pp 9977-9994.

    This one was negative also. Seems a bit strange if the variability was pre-calculated.

    Can I ask your background?

  10. Hmm, I got the impression a larger number were accepted. 18 is not enough to explain a negative series. That said, many of the tree ring series appear to have negative correlations. The huge number of tree rings in the data set would probably explain why you can get the inverse hockey stick.

    It does not make sense that a *temperature* series would be anti-correlated. I guess that is just one more example of what happens when you process data with a statistical meat grinder and blindly accept whatever pops out.

    I am just trying to anticipate the counter arguments. I suspect they will insist that your other comparisons are meaningless because we ‘know’ the temperature did not change in that way. Similarly, they will likely insist those proxies which match the temperature must be valid temperature proxies since we “know” the temperature went up.

    It is absurd but that is the state of climate science. Results that conform to expectations are presumed to be correct because they conform to expectations. People who disagree are told that they must demonstrate that the conclusions are wrong. It is not enough to demonstrate the method is invalid because Mann and co believe that getting the “right” results proves that the method must be valid. The only way to convince them that the method is wrong is to show that the answer produced by that method is wrong.

    I have an engineering background.

  11. Rather late in commenting on this, but very nice to see! I was particularly interested in comparing graph #3 to graph #1:

    While the correlation isn’t quite as good for #3 as for #1, and thus the blade of the hockey stick isn’t as large, it is enough to show that Mann’s algorithm can create a hockey stick out of *any* temperature data! And that is without hand-picking proxies or tweaking the algorithm.

    If it can create an 1800-1900 temperature rise (while keeping the rest of history fairly flat) using real data, then there is simply no reason to believe that our 1900-2000 temperature rise is anything unusual.

  12. Chris H,

    I think you are almost right. The conclusions I made are

    1. The data is highly random

    2. The CPS and r sorting methods used for sorting proxies are highly flawed in their concept.

    3. We have no reason whatsoever to believe that these proxies as a set have any relationship to temperature. (Remember that the pass correlation percentage was the ONLY basis for saying these proxies are temp!)

    I wish we knew what temperatures were in history. Craig Loehle wrote a paper where he took proxy data which is believed to represent temperature, took a huge leap of reason, and averaged it.

    It’s quite refreshing really. Averaged, nice happy data averaged together, no statistical mallet, no math to pick your favorites out. Just averaged. I think I’ll do a post on that soon.

  13. Raven,

    It really is bad. I spent the night reading statistics again. I’m learning a lot. I have already seen a bunch of misunderstandings in the discussion among Climate Audit’s posters.

    I don’t think people can get as good a feel for how bad the paleo portion of this science is until they dig into the data. I am interested in taking a crack at some of these climate models down the road. I believe the IPCC has so much control over whose work gets out that none of the work can be trusted at this point.

    I felt this post and the one which demonstrated patterns were just cool looking results (except for the high correlation to negative temperature). The real argument I have against the CPS correlation sort is the distortion of temperature scale.

    In previous posts I planted signals in red noise and tried to use M08 math to go find them. The signal is distorted and de-magnified in history; even though I knew the exact signal, it cannot be extracted using these methods.
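A minimal sketch of that kind of experiment, not the code from the earlier posts. It assumes a planted sine wave in the "historic" period followed by a linear rise in the calibration period, AR(1) noise with coefficient 0.7, a 0.4 correlation cutoff, and a simplified CPS-style step (screen, average, then match the mean and standard deviation of the target over the calibration window). All of those choices are hypothetical. The recovered amplitude of the planted historic signal is compared with and without the screening step.

```python
import math
import random


def ar1_series(rng, n, rho):
    """Return n samples of AR(1) red noise with lag-one coefficient rho."""
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


def mean(x):
    return sum(x) / len(x)


def std(x):
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return cov / (len(a) * std(a) * std(b))


def screen_and_scale(proxies, target, r_min):
    """Keep proxies with calibration r > r_min, average, variance-match to target."""
    cal = len(target)
    kept = [p for p in proxies if pearson(p[-cal:], target) > r_min]
    if not kept:
        raise ValueError("no proxies passed screening")
    comp = [mean(col) for col in zip(*kept)]
    scale = std(target) / std(comp[-cal:])
    offset = mean(target) - scale * mean(comp[-cal:])
    return [scale * v + offset for v in comp], len(kept)


def recovered_gain(recon, truth):
    """Least-squares amplitude of truth present in recon (1.0 means undistorted)."""
    mr, mt = mean(recon), mean(truth)
    cov = sum((r - mr) * (t - mt) for r, t in zip(recon, truth))
    return cov / sum((t - mt) ** 2 for t in truth)


HIST, CAL = 300, 100                                              # hypothetical lengths
rng = random.Random(7)
signal = [math.sin(2 * math.pi * i / 150) for i in range(HIST)]   # planted historic signal
signal += [i / (CAL - 1) for i in range(CAL)]                     # calibration-period rise
proxies = [[s + e for s, e in zip(signal, ar1_series(rng, HIST + CAL, 0.7))]
           for _ in range(1000)]
target = signal[-CAL:]

screened, n_kept = screen_and_scale(proxies, target, 0.4)
unscreened, _ = screen_and_scale(proxies, target, -2.0)  # cutoff below -1 keeps everything
g_screened = recovered_gain(screened[:HIST], signal[:HIST])
g_all = recovered_gain(unscreened[:HIST], signal[:HIST])
print(f"kept {n_kept} of {len(proxies)} proxies")
print(f"historic gain with screening:    {g_screened:.2f}")
print(f"historic gain without screening: {g_all:.2f}")
```

In this toy setup the screening step prefers series whose noise happens to rise during calibration, which exaggerates the calibration-period trend of the composite. Scaling the composite back down to match the target then shrinks everything before the calibration window, including the planted signal.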

  14. Jeff,

    The distortion of the signal that you demonstrated is an extremely interesting result. I realize getting it written up in a peer reviewed paper would probably take more time than you have to spend on the topic but have you considered approaching some climate scientists and seeing if they would be interested in being a co-author?

  15. Going by the amount of “peer pressure” to not say anything (that might be interpreted in any way as) against AGW, even if it is well researched & argued & would move the science forward, I have doubts that a paper would make it past “peer review”. It’d probably have to be published in some fringe publication, the sort that isn’t taken seriously (or read by) most climate scientists. 😦

    There are also probably very very few climate scientists who would feel able to co-author a paper, even if they secretly agreed, again because of peer pressure.

    This isn’t the way science is supposed to work, but then Galileo Galilei knows a lot about that…

  16. I posted this on RealClimate but I didn’t get an answer. It was very late in the thread and basically disappeared off the front page so it’s not surprising. Anyway, do you think it makes any sense?



    Comment by Briso — 16 October 2008 @ 3:29 AM

    >Now a really stupid question. It looks to me like the only lines which go above the early highs generated from proxy data (~960 AD) are the instrumental record data. Does this not show that the proxy data suggests warmer times in the past than during the more recent proxy period? Comparing that to instrumental data is apples and oranges, no?

    >>[Response: No. The proxies are calibrated to the instrumental target just so that they will be comparable. – gavin]

    I’ve been looking at the paper again and trying to understand it. First, an important quote in the context of the AGW issue.

    “Because this conclusion extends to the past 1,300 years for EIV reconstructions withholding all tree-ring data, and because non-tree-ring proxy records are generally treated in the literature as being free of limitations in recording millennial scale variability(11), the conclusion that recent NH warmth likely** exceeds that of at least the past 1,300 years thus appears reasonably robust. For the CPS (EIV) reconstructions, the instrumental warmth breaches the upper 95% confidence limits of the reconstructions beginning with the decade centered at 1997 (2001).”

    Further down on the same page (italics added by me):

    “Peak Medieval warmth (from roughly A.D. 950-1100) is more pronounced in the EIV reconstructions (particularly for the land-only reconstruction) than in the CPS reconstructions (Fig. 3). The EIV land-only reconstruction, in fact, indicates markedly more sustained periods of warmer NH land temperatures from A.D. 700 to the mid-fifteenth century than previous published reconstructions. Peak multidecadal warmth centered at A.D. 960 (representing average conditions over A.D. 940–980) in this case corresponds approximately to 1980 levels (representing average conditions over 1960–2000). However, as noted earlier, the most recent decadal warmth exceeds the peak reconstructed decadal warmth, taking into account the uncertainties in the reconstructions.”

    OK, some questions.

    1. Does the EIV reconstruction represent a forty year moving average as suggested by the part I italicized?

    2. Does the instrumental record shown on the graph in Fig 3 represent a forty-year moving average? I say no, because such a plot would have an end point in 1987. It looks like a five year moving average perhaps?

    3. It is true that the upper 95% confidence level of the peak warmth centered at A.D.960 of the EIV land-only reconstruction is approximately 0.4. I assume that this means that peak of the five year average temperature at that time would have been considerably higher?

    4. Is it not true that whatever the red line in figure 3 is, it is an apple being compared to a pear?

    5. “Peak multidecadal warmth centered at A.D. 960 (representing average conditions over A.D. 940–980) in this case corresponds approximately to 1980 levels (representing average conditions over 1960–2000).” Corresponds approximately? Shouldn’t that be exceeds significantly? If my figures are right, 1960-2000 HadCrut NH 40 year average – 0.06 (68-08 app 0.17), 960 PMW central – app 0.25, 960 PMW upper 95% – 0.4?

    6. Does this paper really show that “recent NH warmth likely** exceeds that of at least the past 1,300 years”?
