the Air Vent

Because the world needs another opinion

TTLS Reconstruction by Steve M

Posted by Jeff Id on June 3, 2009

Steve McIntyre described a method (link here) to reconstruct historic temperatures using truncated total least squares (TTLS) calibration. The method is interesting because, by its nature, it recognizes that the AVHRR satellite data is not good for creating trends and instead uses surface station information, spread according to its covariance with the satellite grid. I spent a good amount of time reading the math and understanding the method so, like usual, I’m tired and putting up what I found.
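For readers who have not met TTLS before, here is a minimal sketch of the textbook truncated total least squares solve using the SVD of the augmented matrix. This is not SteveM's code; the function name `ttls` and the argument names are assumptions made for illustration, and the truncation level `k` plays the role that regpar plays in the post.

```python
import numpy as np

def ttls(A, B, k):
    """Truncated total least squares solution X of A @ X ~ B.

    A: (n, p) predictors, B: (n, q) targets,
    k: number of singular values retained (1 <= k <= p).
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if B.ndim == 1:
        B = B[:, None]
    p = A.shape[1]
    if not 1 <= k <= p:
        raise ValueError("k must satisfy 1 <= k <= number of columns of A")
    # SVD of the augmented matrix [A B]; columns of V are right singular vectors.
    _, _, Vt = np.linalg.svd(np.hstack([A, B]), full_matrices=True)
    V = Vt.T
    V12 = V[:p, k:]  # top block of the discarded singular vectors
    V22 = V[p:, k:]  # bottom block of the discarded singular vectors
    return -V12 @ np.linalg.pinv(V22)
```

With `k` equal to the number of columns of `A` this reduces to ordinary total least squares; smaller `k` regularizes the solution by discarding more of the small singular directions.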

Because it is a multivariate regression, the potential for odd solutions at higher orders is a problem, but I wanted to see how similar it was to RegEM. I set up an algorithm to run SteveM’s code at a variety of PC and truncation (regpar) levels. This method is easier to use because of its speed, and I like it better than the original Steig reconstruction because it only uses surface station data.
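The loop over settings can be sketched roughly as follows. Everything here is hypothetical scaffolding: `sweep` and `placeholder_reconstruct` are made-up names, the PC and regpar levels are only example values, and the stand-in function would be replaced by a call into the actual reconstruction code, which is not reproduced here.

```python
from itertools import product

def sweep(reconstruct, pc_levels, regpar_levels):
    """Run reconstruct(n_pc, regpar) for every combination and collect the results."""
    results = {}
    for n_pc, regpar in product(pc_levels, regpar_levels):
        results[(n_pc, regpar)] = reconstruct(n_pc, regpar)
    return results

def placeholder_reconstruct(n_pc, regpar):
    # Hypothetical stand-in: a real run would return, for example,
    # the reconstructed trend in C/decade for these settings.
    return {"n_pc": n_pc, "regpar": regpar}

# Example levels only; they are assumptions, not the settings used in the post.
grid = sweep(placeholder_reconstruct, pc_levels=[3, 6, 9, 13], regpar_levels=[1, 2, 3])
```

Keying the results by the (PC, regpar) pair makes it easy to tile the plots afterwards and to spot where the solution starts to look overfitted.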

I’ve made a lot of plots but I’ll put up just a few.

[Plot] Trend = 0.061 C/Decade

[Plot] Trend = 0.040 C/Decade

[Plot] Trend = 0.051 C/Decade

[Plot: multivar pc=9] Trend = 0.075 C/Decade

[Plot] Trend = 0.053 C/Decade

[Plot] Trend = 0.056 C/Decade

The raw surface station data plot by simple area weighting looks like this.


[Plot] Trend = 0.044 C/Decade
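For anyone wanting to try the simple area-weighted average themselves, a minimal sketch is below. How the area for each station is assigned is an assumption left to the reader (the post does not spell it out), and the function name `area_weighted_mean` is made up for illustration. Missing months are assumed to be stored as NaN.

```python
import numpy as np

def area_weighted_mean(anomalies, areas):
    """Area-weighted mean per time step, skipping missing (NaN) stations.

    anomalies: (n_times, n_stations); areas: (n_stations,) positive weights.
    """
    anomalies = np.asarray(anomalies, dtype=float)
    areas = np.asarray(areas, dtype=float)
    present = ~np.isnan(anomalies)
    weights = np.where(present, areas, 0.0)  # zero weight where data is missing
    weighted = np.where(present, anomalies, 0.0) * weights
    total = weights.sum(axis=1)
    # Time steps with no reporting stations come out as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return weighted.sum(axis=1) / total
```

The weights are renormalized at each time step over whichever stations reported, so a station dropping out does not drag the average toward zero.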

And here is Ryan’s method, which minimizes any satellite influence.

[Plot] RyanO 13pc regpar=9, Trend = 0.060 C/Decade

All of the surface station data methods show cooling at the South Pole, with trends ranging from 0.04 to 0.075 C/Decade, which is substantially lower than the 0.118 C/Decade of Steig et al.
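The trend figures quoted above are in degrees C per decade. A minimal sketch of how such a number can be computed from a monthly anomaly series with an ordinary least squares line is shown here; the function name `trend_per_decade` is an assumption, and this is not necessarily the exact calculation behind the plots.

```python
import numpy as np

def trend_per_decade(monthly_anomalies):
    """Least squares linear trend of a monthly series, in units per decade."""
    y = np.asarray(monthly_anomalies, dtype=float)
    t = np.arange(y.size) / 12.0  # elapsed time in years
    ok = ~np.isnan(y)             # ignore missing months
    slope_per_year = np.polyfit(t[ok], y[ok], 1)[0]
    return 10.0 * slope_per_year
```

Note that this gives only the slope; it says nothing about the uncertainty of the trend, which matters a great deal when the trend is a few hundredths of a degree per decade.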

26 Responses to “TTLS Reconstruction by Steve M”

  1. Jeff, I’m not “developing” a method. All I’m doing is isolating the calibration and TTLS portion from the RegEM portion to try to see what’s driving the bus.

  2. Also I’m not taking any position on whether “AVHRR satellite data is not good for creating trends”. If you’re doing a “reconstruction” of AVHRR data before there was AVHRR data, then you have to use surface station data to do so. Since all the operations are linear, that means that the recon is a linear weighting of the surface stations – with the weights changing according to regpar and PC choices. I’m not opining on the merits of AVHRR data.

  3. There’s a lot better case for increasing PC than regpar. It would be interesting, I think, to tile with 1-3 regpar (no higher) and increase the PCs, perhaps even higher.

  4. Jeff Id said

    I wasn’t intending to say developed, I’m a bit tired so I changed the wording to described.

    Also I completely agree with the high PC low regpar. It makes a lot of sense to me. The reason it isn’t done above is that I’m lazy and simply ran the algorithm until it became horribly overfitted. I’ve done a ton of reading tonight, on just about every topic so that’s all I did.

    #2 – I’m taking the position that AVHRR is not as good for computing trends. I’m happy to take it: after looking at the data from the NSIDC with JeffC, and at the satellite drift, I think surface stations are far better instruments. Dr. Comiso really worked some serious magic to get the data to look as clean as it does. Because of this I’m certain the trend represented is nowhere near as good as one might think from the presented data.

  5. Layman Lurker said


    “Dr. Comiso really worked some serious magic to get the data to look as clean as it does. Because of this I’m certain the trend represented is nowhere near as good as one might think from the presented data.”

    Jeff, the Comiso processing eliminated some of the chaotic swings in HF amplitude and seemed to bring the trends more into line with surface stations. Did his processing preserve the integrity of the surface station HF signal (and by extension – the covariance relationships) at corresponding AVHRR grids?

  6. Jeff Id said

    #5 My impression was that the cloud noise in the AVHRR often swamped the temp signal but wasn’t able to prevent some of the good signal getting through. They state the accuracy as +/- 2C, but the reality is that some months it swings ten degrees out of whack. Fortunately the anomaly varies by several degrees as well, so it comes through, giving some signal to lock onto.

  7. Kenneth Fritsch said

    As I recall, the Steig paper admitted that the cloud masking, and thus by inference the AVHRR data, was the weakest link in the reconstruction.

    The point I picked up on was that tacking the AVHRR data, in effect, onto the end of the 1957-1982 reconstruction will provide an artificial trend if the AVHRR and surface station temperatures are biased relative to one another. I think RomanM at CA was suggesting a way around this problem while still making use of the AVHRR data for the 1982-2006 time period.

    As a peripheral point, I guess when one sees all that AVHRR data from every location in the Antarctic one would like to use it for more purposes than spreading the surface station data to all corners of Antarctica. On the other hand, can it be shown at this point in time that the AVHRR data and its adjustments are sufficiently “mature” to provide confidence in its use? If the AVHRR data does not correlate with nearby surface stations, which source is the more correct? What about the questions that Ryan O and Ryan C raised about the apparent differences between the various satellites’ measurements?

    These are, I think, all the kinds of questions that will continue to go unanswered or hand waved away by the Steig authors, but that become critical in doing a proper analysis of the paper.

  8. Layman Lurker said


    “My impression was that the cloud noise in the AVHRR often swamped the temp signal but wasn’t able to prevent some of the good signal getting through.”

    So are you able to say confidently that the Comiso processing resulted in neither stronger nor weaker correlations when comparing surface stations and the corresponding AVHRR grids?

  9. curious said

    #7 – “If the AVHRR data does not correlate with nearby surface stations which source is the more correct?”

    This still bothers me – how well do station by station, time of day equivalent measures correlate? I’ve asked this before in the context of the averaging used on AWS data but no one has picked up on it.

    Could be I’m missing something very elementary or something already covered in one of the posts, so apologies if this is the case – would be good to know from one of the guys at the coal face! Thanks

  10. Jeff Id said

    #9 #7

    The station vs AVHRR correlate to some degree but you can see the noise level in the spread of the scatter plot.

    The pictures didn’t load for me when I looked at this. If I clicked on them they showed up. If you others have the same problem let me know.

  11. Layman Lurker said

    #10 thanks Jeff. I had forgotten about this post.

  12. The Diatribe Guy said

    I’ve followed this for the last couple months now as I’ve seen all this complex analysis, scientific papers, computer modeling and so on. As a career actuary, my opinion of fancy modeling versus common sense is no different than it was prior to the original Steig paper.

    In business practice, when money and profit is on the line, I’ve trumped many a model output in my day with common sense and much simpler approaches. I have yet to remember a time where that wasn’t the superior approach. Basically, I’ve found that all these tools can simply confirm what is already suspected. Seldom do they provide a surprising result that is, in fact, reality.

    I very much enjoy all of this – don’t get me wrong. But I pretty much live by the 80/20 rule, and will continue doing so.

    Now… where’s my slide rule? (OK, I admit… I’m not quite that old…)

  13. Jeff Id said

    #12, I was asked to do an analysis of noisy data for an optical coating company to look for trends once. They wanted to isolate what was causing failures in their coatings from an incomplete set of data. They had some plots trying to look for relationships and wanted something extracted.

    My answer was that in statistical analysis if you don’t know the answer before you begin you aren’t likely to find a good one after. I gave some recommendations for additional testing to isolate the variables rather than go through JeffEM or something.

  14. Kenneth Fritsch said

    What I should have said is: if the AVHRR and nearby surface station data trends (1982-2006) differ, which one is correct? Jeff, do you have a link to these data?

  15. Jeff Id said

    The data I used is the AVHRR Comiso dataset and the SteveM surface station data. There are a large number of R functions which download the data but it’s too large for a spreadsheet. Sorry, I can’t remember if you use R.

  16. Jeff Id said

    #14, My opinion is that it’s not a close call and the surface stations are more accurate. However, an opinion doesn’t count for much, especially from a non-climatologist;).

    I think that a statistical analysis of the AVHRR data would show a great deal more deviation from true temperature. In Steig et al, the authors were forced to clip anything in the sat data greater than 10C from climatology, I don’t know if you have read the paper.

    This basically means the satellite data, which measures surface skin temp rather than air temp, had monthly temperature averages which drifted more than 10C from a normal year! I remember seeing deviations of as much as 15C. When you’re looking for a trend of tenths of a degree per decade, you know better than I that this makes a big difference.

  17. curious said

    #10 – Thanks, a post I’d also forgotten/not fully absorbed. I’ll go back and (re) read more posts – I still have a niggle but it could be I’ve missed it.

  18. Kenneth Fritsch said

    I don’t know if you have read the paper.

    I have read it 3 times. Should I read it again?

    Jeff, what I was asking is whether anyone has compared the trend from all of the surface stations (1982-2006) to the trend of the corresponding AVHRR data from close proximity to the surface stations. That is of course not the same as the high frequency correlations.

    I suspect that the AWS and manned surface stations have some measurement issues also, but probably not what the satellite AVHRR could potentially have. I suspect that the measurement error bars are considerably larger, or at least more uncertain, than the Steig authors might be inclined to admit. To me the Steig exercise was to obtain a warming trend in Antarctica from some point in time that they could show was statistically different from zero, and that does not allow for a lot of uncertainty.

  19. Ryan O said

    #18 Yes. That is what I did in all of my calibration steps. At some point, I’ll put up a post dedicated to just that. I’m working on something else at the moment, though . . . so I’ll have to get back to your question later, Kenneth, if that’s all right. 🙂

  20. Jeff Id said

    #18, Sorry about the idiotic question. I was going too fast and considered that many of the things you’ve done you might already know, it’s a compliment. I’m really too busy to be running a blog, a company and a family.

    I have run the AWS and AVHRR correlation vs trend on one post here. It was a short post but they don’t match well, trends and correlation AVHRR vs SST are randomized. I’m busy again tonight so it will take until tomorrow before I can look.

    Stupid Red Wings!

  21. timetochooseagain said

    Anyone planning on writing up a layman’s guide to the Steig controversy when this is all said and done? My head is spinning from all the statistics I never learned in High School…

  22. Kenneth Fritsch said

    Jeff ID, I am really amazed how you guys, who have been doing all these analyses and are fully employed, and in your case, Jeff, running a business and blog, can keep up your paces. And believe me, I am not at all abused by your question of reading. My answer was meant to be a bit of dry humor –and actually with some serious content, as every time I read the Steig paper I see different nuances.

    Jeff, I still think that your Red Wings are the better team and particularly so on the offensive end. Hot goalies in the play-offs can make a big difference, however, and the Penguin goalie, Fleury, made a kick save last night when I would swear his whole body was off the ice. Crosby, Malkin and Fleury could be difference makers, but will be less so on the road. With the relentless Red Wing attacks, I can see where a goalie might eventually start to crack.

    Somewhere into the game last night an announcer mentioned something about fewest penalties in 64 years but I was not sure in what context. I saw 2 high sticking calls that in the (good) old days would have been considered unintentional and not called. I also remember when the NHL was being criticized for dumping the puck into the offensive zone instead of skating it in like the Russians and Europeans used to do and with good success. I guess everybody dumps it in these days.

    Ryan O, do what you have been doing and I will continue to look forward to reading your analyses. I have learned better patience in my retirement. I am completing three major yard projects and a tropics surface to troposphere temperature analysis this week and really enjoyed doing them because I did not feel pressure to complete them in any particular hurry.

  23. I think I’ve just come up with an interesting method for getting a trend from 2-D data like this…
    One of the successes that was reported by the Connection Machine (a massively-parallelised processor architecture produced by Thinking Machines Corp) according to Paul Hoffman is “visual adaptation” – moreover, it ‘learned’ to do this.

    Maybe we could apply visual adaptation algorithms to 2-D data sets to remove noise from them.
    Then we wouldn’t need dubious statistical methods like PCA.

    Is there anything in this or am I talking nonsense? AI not being my field, I’m not sure what is currently possible.

  24. Mark T said

    PCA, by itself, is not really a “dubious statistical method.” Like any statistical method (at least, those that are not immediately straightforward), it can be tortured into giving up the answer one seeks. When that answer was otherwise blind to the seeker, there is a problem. Variants of PCA are widely used in so many applications the average person would be hard pressed to avoid them in a normal day. PCA algorithms are also learning algorithms, at least, they can be implemented in an adaptive manner (online), which is not unlike what Paul Hoffman was talking about, if only a little simpler.

    Btw, I remember when Thinking Machines was founded. I wanted to work there. Seems silly now, given what I have accomplished in the time since they went bankrupt. Computers are so powerful now that the adaptive processing they were doing then can be handled on a laptop using MATLAB now. Inverting enormously large autocorrelation matrices (often a fundamental concept in many adaptive algorithms, PCA included) can be done in the blink of an eye. In fact, I just inverted a 100×100 matrix in about one second on a two-year old desktop using MATLAB 2008a. When Thinking Machines came onto the scene, this probably would have taken weeks to do. Sigh…


  25. PCA, by itself, is not really a “dubious statistical method.” Like any statistical method (at least, those that are not immediately straightforward), it can be tortured into giving up the answer one seeks.

    True… which is why a more autonomous (‘learning’ier) algorithm would have the advantage that there are fewer decisions to be made by the operator and thus fewer degrees of “torture” freedom – thus making falsification more difficult. Of course, when we feel the need to develop algorithms that protect science from dishonesty in its practitioners, something is horribly wrong – and that something leads to bad science like AGW (imho).

    As for TMC, I find it interesting that chip manufacturers are now turning to parallelisation, and the ‘cloud’ paradigm can be seen similarly. If I were a couple decades older than I am, I’d probably have wanted to work there too. (TMC were already in decline by the time I was born.)

    I’m babbling now, aren’t I? Sorry.

  26. Mark T said

    With PCA it is simply the number of PCs to retain. In many instances, such as multipath in a US cellular phone system (called PCS, but that would be confusing), the PCs are obvious. You also have a priori knowledge of their structure. Blind applications, not so much the same, and subjective selection criteria must be implemented. It is much easier to do, btw, when you have known noise distributions. The only known noise in the temperature readings is the uniformly distributed (assumed) measurement noise. Exactly what other “noise” would be present? Describe it physically and I’ll consider it.

    Oh, TMC was peaking when I was an undergrad… sigh.

