Hockey Stick Graphs a Series of Epiphanies (Part 1)

Ok, it’s time to summarize what I have learned from exploration of proxy data. I went back to Climate Audit and found the exact moment that I learned how hockey sticks were made.

You can see my dim little mind working, trying to grasp what has become so obvious now. My first post:

  1. jeff id
    September 3rd, 2008 at 7:42 pm

    You know I only work in Optics. Its a more simple field which still doesn’t hold all the answers either. When I see that of 1209 records only 484 pass the p=.1 test for measured records and the measured records have problems, how can we make any real conclusion. I understand the reasoning for an initial filter of the data. I really have a hard time resolving why there is this unreadable algorithm to combine the accepted data. It doesn’t seem kosher and it wouldn’t pass the stink test in my field. I’m pretty sure it couldn’t be published at this point.

    What is happening here? Is it as bad as it looks? Am I missing something?

    I just have to ask, if you put together a bunch of random lines or mildly significant data sorted by the single requirement that they meet a rising curve at the end, you will get a flat average with a peak at the end. I really have to question whether the statistical variations in the data which eliminated 60% of the initial sets are misunderstood processes which exist throughout the graphs of even the “significant” data.

    The net result is obvious, a peak with a flat curve before it.

    I just started coming to this site recently, is this the point? I mean beyond the implications of the curve with the spike above.

  2. jeff id
    September 3rd, 2008 at 8:21 pm

    It’s not enough for Mann to have the AGW peak at the end, they need the medieval warming and the Maunder minimum to be resolved. Am I getting this?

    So I’m sorry to fill this post with so many questions, some feedback would be helpful, but unless I am totally off base, one thing we can say for sure is that any “trend” in historic data from these datasets would to a “high degree of certainty – IPCC nomenclature” be significantly rounded from peak actual temperature values.

    Is this what the rest of you see?

You know what I got for an answer? Check this out:

September 3rd, 2008 at 11:57 pm

jeff_id (25-28):

You are going through the same questions that I went through when I found this blog over two years ago. (BTW, I came here with no particular viewpoint, and after reading for a while.) I kept saying to myself, “No, it can’t be that bad!” But I kept finding out that it was that bad, and worse. It kind of reminds me of watching the movie “Dangerous Liaisons”, thinking repeatedly that the characters couldn’t keep stooping lower, but they did.

I think you will find it worth your while (as for other newbies) to spend a good amount of time methodically reviewing the archives here. Start with Ross McKitrick’s “What is the Hockey Stick Debate About?” Also read the Wegman Report. What struck me about both papers is how quickly they went from the issue of the (in)correctness of the hockey stick (there are lots of mistakes in science) to the bigger issue of how quickly and uncritically it and similar papers were accepted by the climate science community. I agree with them that this is the much more important issue.

As for the issue of intent, we are discouraged from speculating on motives here. But since you are statistically savvy, try this thought experiment as you go through the archives. Start with the null hypothesis that these are random errors that could go either (any) way in terms of your result. Keep track of which “direction” the errors tweak the data. What p-value do you end up with for your null hypothesis?

I tell you what, better advice than that is hard to come by.

My next thought was to question Real Climate about what I had just learned. I pointed out that sorting the data seemed like it would cause the temperature graphs to be distorted by the mathematics. Gavin Schmidt had replied to my questions in the past. Instead of a reply, my post was CUT, CENSORED FROM THE DISCUSSION. An engineer who wanted answers got nothing. I realized I had stepped on a hot button.

Well, many of you know what happened next. I had to have answers; it’s my nature. I dug into the data and my blog changed completely. I had started this blog only a couple of weeks earlier for entertainment, and now I was buried in proxies. I had to look up exactly what they were, talk about green! Noconsensus was meant to refer to the concept of group agreement rather than global warming; my intent was originally to vent about politics, global warming, energy and a bunch of other things.

All I could think of was the best way to demonstrate that sorting data for a trend would create artificial temperature distortions. It seemed so obviously wrong that there must be a way. After putting several hundred hours into this study, now only 6 weeks later, I have proven the distortions in the temperature graphs to my satisfaction beyond a shadow of a doubt. Talk about obsessive, wow.
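
For readers who want to try the idea themselves, here is a minimal Python sketch of the sorting effect. It is not the M08 algorithm and it uses no proxy data: the series are synthetic, trendless red noise, and the AR(1) coefficient of 0.9, the 150-point calibration window and the 0.3 correlation cutoff are all assumptions picked for illustration. The function names are hypothetical.

```python
import numpy as np

def make_red_noise(n_series, n_years, phi, rng):
    """Generate trendless AR(1) series: x[t] = phi * x[t-1] + white noise."""
    noise = rng.standard_normal((n_series, n_years))
    series = np.zeros_like(noise)
    series[:, 0] = noise[:, 0]
    for t in range(1, n_years):
        series[:, t] = phi * series[:, t - 1] + noise[:, t]
    return series

def select_by_trend(series, calib_len, r_min):
    """Accept series whose last calib_len points correlate with a rising ramp."""
    ramp = np.arange(calib_len, dtype=float)
    calib = series[:, -calib_len:]
    # Pearson correlation of each calibration window with the ramp.
    calib_c = calib - calib.mean(axis=1, keepdims=True)
    ramp_c = ramp - ramp.mean()
    r = (calib_c @ ramp_c) / (
        np.linalg.norm(calib_c, axis=1) * np.linalg.norm(ramp_c)
    )
    return r > r_min

rng = np.random.default_rng(0)
n_series, n_years, calib_len = 1209, 1000, 150  # 1209 echoes the record count; the rest is assumed
series = make_red_noise(n_series, n_years, phi=0.9, rng=rng)
accepted = select_by_trend(series, calib_len, r_min=0.3)

# Simple average of the accepted series only.
accepted_mean = series[accepted].mean(axis=0)
shaft = accepted_mean[:-calib_len]   # everything before the calibration window
blade = accepted_mean[-calib_len:]   # the calibration window itself
blade_rise = blade[-10:].mean() - blade[:10].mean()

print(f"accepted {accepted.sum()} of {n_series} series")
print(f"shaft std: {shaft.std():.3f}, blade rise: {blade_rise:.3f}")
```

If the selection alone shapes the result, the accepted average should stay nearly flat before the calibration window and rise inside it, even though no individual series was generated with a trend.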

I find it interesting to look back at where I was only 6 weeks ago. A regular poster called bender mentioned, correctly, that I was messing up the thread. I wasn’t worried about that, though.

jeff id:
September 4th, 2008 at 6:00 am

I’m sorry to push the thread off topic. When I started reviewing AGW science more seriously last year, I didn’t know anything about paleoclimate temperature reconstruction. I had no preconceived notion of who was right and flatly refused to make any conclusions. I hoped it would be a bit easier to sort out not wanting to become a climatologist after all, no offense. I have read a couple dozen scientific papers on the topic, this is my first reading of Mann’s hockey stick methodology.

I am surprised to say the least, that this is what you have been fighting with and the same was done with tree ring data. I could barely sleep last night after figuring this out. Instead of resolving anything, I need to delve deeper into the reconstructions. Maybe I’ll need to find the datasets myself and write some of my own code.

Thanks for the help. I’ll leave you alone now.

Bender replied that he didn’t want to put me off, but it was only today that I went back to that thread. I never read his reply because I wasn’t put off at all.

Funny stuff looking back now. I really don’t want to be in the middle of this battle, but now I am working towards a publication. Strange world for sure. I got some help from CA to download the data and made some initial plots; the first graph is very interesting to me even today.

The blue line is a simple average of the 484 proxies that were accepted by M08. The purple/pink line is the rejected data from the same plots. I spent a long time looking at this graph. How can they be mirror images? It doesn’t make sense! The green line is the magnitude of the difference; clearly the difference is greatest in recent years. Once you understand it, the green line clearly demonstrates that magnification of the local (1850-2000) data has occurred.

I will continue this series later. In the meantime, ask yourselves why these graphs are nearly mirror images. Does it make sense?


Added Oct 15,08

I tell you what, Z below really nailed it, but look at the assumption he started with.

He said:

“if you need to generate a certain profile from a bunch of data which average out to a basically a straight line then ”

His assumption is my conclusion. Z must be a lot smarter than I am, or he must have already decided from something else that the data is purely random.

This was my first big revelation. These proxies are supposed to represent temperature, so there should be a fairly strong repeated pattern in them somewhere ….anywhere. But there is nothing; the flatness of the green line from 1850 (end of the calibration range) to 1180 is amazing. Somewhere in that line I expected to see a hump, a point where most of the data agreed and the rejected data took on a bit of shape like the accepted data.

There is nothing. The data mirrors nearly perfectly, so Z above makes the assumption that the data is actually random and trendless and the light bulb goes on – everything makes sense.
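
The mirror image comes down to simple bookkeeping, which the Python sketch below checks numerically. It assumes the rejected group is everything that was not accepted (725 series if 484 of 1209 pass), which may not match exactly how the plot was built. The data is synthetic white noise and the screening rule is hypothetical; only the two counts come from the discussion above.

```python
import numpy as np

rng = np.random.default_rng(1)
n_series, n_years = 1209, 500   # 1209 is the record count quoted above
n_accepted = 484                # so is 484; everything else here is made up
series = rng.standard_normal((n_series, n_years))

# Hypothetical screening rule: accept the 484 series with the largest
# rise over the final 50 points.
rise = series[:, -25:].mean(axis=1) - series[:, -50:-25].mean(axis=1)
accepted = np.zeros(n_series, dtype=bool)
accepted[np.argsort(rise)[-n_accepted:]] = True
n_rejected = n_series - n_accepted

all_mean = series.mean(axis=0)
accepted_mean = series[accepted].mean(axis=0)
rejected_mean = series[~accepted].mean(axis=0)

# Exact identity: N * all_mean = n_a * accepted_mean + n_r * rejected_mean
reconstructed = (n_series * all_mean - n_accepted * accepted_mean) / n_rejected
print("identity holds:", np.allclose(rejected_mean, reconstructed))

# When the overall mean is near zero, rejected ~ -(n_a / n_r) * accepted.
mirror = -(n_accepted / n_rejected) * accepted_mean
print("largest departure from a scaled mirror:", np.abs(rejected_mean - mirror).max())
print("largest overall mean:", np.abs(all_mean).max())
```

The departure from a perfect scaled mirror is exactly the overall average times N / n_r, so the flatter the average of all series, the closer the rejected curve comes to an upside-down copy of the accepted one. With unequal group sizes the mirror is scaled by n_a / n_r rather than exact.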

You wouldn’t believe how many hours I stared at this graph and the code that made it looking for problems. I kept asking myself “where’s the trend.”


Added Oct 16,08

Added by request below.  Same graph as above with filtering removed from green line.
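
For anyone wondering why a filtered magnitude line never touches zero, here is a small Python sketch. The filter originally used is not known for certain, so the 20-point centered moving average is an assumption, and the difference series is synthetic noise rather than the real accepted-minus-rejected data.

```python
import numpy as np

def moving_average(x, window):
    """Centered moving average; output has the same length as x."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="same")

rng = np.random.default_rng(2)
years = np.arange(1000, 2000)
# Hypothetical stand-in for (accepted mean - rejected mean):
# small noise that crosses zero many times.
diff = 0.1 * rng.standard_normal(years.size)

magnitude = np.abs(diff)                      # unfiltered magnitude
smoothed = moving_average(magnitude, window=20)
interior = smoothed[10:-10]                   # skip zero-padded edges

sign_changes = int(np.sum(np.sign(diff[1:]) != np.sign(diff[:-1])))
print(f"zero crossings in the raw difference: {sign_changes}")
print(f"raw magnitude minimum:      {magnitude.min():.4f}")
print(f"smoothed magnitude minimum: {interior.min():.4f}")
```

The raw magnitude drops close to zero every time the two curves cross, but averaging 20 neighboring magnitudes bridges those dips, so the smoothed line stays well above zero.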

10 thoughts on “Hockey Stick Graphs a Series of Epiphanies (Part 1)”

  1. They aren’t *exact* mirror images but if you need to generate a certain profile from a bunch of data which average out to a basically a straight line then “straight line” – “desired profile” will equal -“desired profile” mostly because if you take their slopes, you get 0 – dT/dt which does equal -dT/dt (unsurprisingly).

  2. Jeff, you are to be congratulated on learning the topic and producing such a remarkable piece of work from scratch over such a short period. I cannot wait to see it published.

    I see the problem as being getting it out into the wider world. After all M & M at CA have been trying for years. The establishment (IPCC, governments, most media and greens) have firmly nailed their flag to the mast and don’t want to hear that the iconic symbol and basis for AGW is invalid (or more correctly a fraud perpetrated and encouraged by a small clique of like-minded (or negligent, in the case of peer reviewers; take your pick) supporters). How do you get the message across?

    Does anyone have any idea how to get the message out? In the UK the BBC wouldn’t touch this with somebody else’s barge pole, although there are one or two people in certain newspapers who would publicise it.

  3. Jeff

    As well as having a tremendous skill for unravelling complex mathematical relationships you also have a gift for communicating this to your readers in a relatively easy to understand way.

    This is important & will make it difficult, nigh well impossible, for Mann et al & the likes of RC & people like Tamino to refute it. I suspect one of the reasons they didn’t post your comments on their sites is they don’t know how to answer it. I asked Tamino the innocuous question of what he thought of your analysis but the comment didn’t get past moderation. Open Mind? What a joke!

    It is important your message gets out there. I’m posting on as many relevant websites as I can. I understand you are working towards publishing. Excellent; have you considered writing to PNAS on this?

    My opinion is it was a big mistake for Mann et al to publish this paper. The defence of it by the C.A.G.W. hierarchy will be a bigger one.

  4. Jeff,

    I agree with Z and your response, but I have concerns about the graph and the scales. Can you explain further? Is the mod(accepted-failed) on a different scale? Why does it not go to zero at times? What am I missing?

  5. The green line has some filtering. It’s been a while, but I think it had a twenty year filter so gaps get bridged. Sorry about the confusion.

    This is why I am studying R so hard, people will be able to see the methods and tweak the results themselves. In C++ the code is gigantic.

  6. Jeff,

    Take 1200 time series of random data that averages to a flat line.
    Select those that match a pattern.
    The remainder should be the inverse of that pattern because if X + Y = 0 then X = -Y

    Why don’t you average all of the series together?

  7. Jeff,
    You have to redo your graph. The green line is too confusing and if we take just the legend, it cannot be exact. I like what you do but you have to be precise because you are in a dangerous ground.
    Take out the green line then make a second graph with the difference, the filter and correct explanations.
    It is so easy to take your graph as it is and to show to anybody that you are writing wrong things. It looks like a manipulation.

  8. Raven,

    This isn’t supposed to be random data, it is actual proxy data. These are M08 proxies!

    If you know that already from reading above, I have averaged the proxies together in several other posts.


    I made the green line the way I did because it was the clearest representation of the temp distortion created by preferential sorting. I will try to find the spreadsheet and put the unfiltered version up also.

  9. Jeff says:
    “This isn’t supposed to be random data, it is actual proxy data. These are M08 proxies”
    If it looks like duck, quacks like a duck….
