the Air Vent

Because the world needs another opinion

RC Replies – Dr. Steig Claims Overfitting

Posted by Jeff Id on June 1, 2009

Despite the clear result from Ryan O and the quality of its match to station trends, Dr. Steig claims that Ryan’s results, and Jeff C’s and my own, are the result of overfitting, something I believe has been reasonably well addressed here in numerous posts. However, before we take time to have fun with this, let’s let RC have their moment and read what they have to say with an open mind.

– At least they’re not telling me to take matlab classes anymore.

On overfitting

H/T to the Lurker for pointing this out.

For those who’ve followed along, don’t forget to ask some questions. Save a copy before you submit so that you can repost them here if they get clipped.

And please be courteous and polite.

79 Responses to “RC Replies – Dr. Steig Claims Overfitting”

  1. Kenneth Fritsch said

    I have been hopeful that a coauthor of the Steig et al. (2009) paper would reply to the analyses performed here and at CA, and it appears that is what Eric has done, at least in part, with his overfitting claim against someone he says is referred to as Ryan O.

    My first read of Steig’s thread introduction at RC was not very satisfying with regard to his evidence for overfitting. His reference to North’s 1982 paper as a guide to choosing the optimum number of PCs seems misdirected, as my reading of that paper might conclude in the opposite direction. His examples of physical representations of his PCs are still stuck at naming two, without any further evidence of the connection.

    But that is just my layperson’s view on a first read and since we have much better qualified people to respond to Steig’s post, I will hang up now and listen for the answers.

  2. Layman Lurker said

    Just posted the following at RC (awaiting moderation):

    “Dr. Steig, the choice of 3 PC’s, a low order processing scenario, does not allow peninsula warming to be properly expressed. It does not allow the possibility that peninsula warming may be an entirely regional phenomenon. By definition, the “predictor” impact of a cluster of correlated, warming stations cannot be regionally constrained in a 3 PC reconstruction. Please clarify why such a scenario would not yield spurious smearing.”

  3. Jeff Id said

    #2 My comment was clipped. I didn’t address the science but simply explained that nobody was trying to embarrass him or his comrades. The science will come later.

    Perfect question BTW.

    UPDATE: It’s back in moderation again. It was gone for a bit.

  4. dhogaza said

    I didn’t address the science but simply explained that nobody was trying to embarrass him or his comrades.

    There’s a comment I can read with my own eyes, made by you, on WUWT, that gives the lie to that claim.

    And, yes, when choosing 13 PCs overfitting is the first question that comes to mind.

    I’ve not been impressed by the quality of your earlier claims, nor was Tamino, a professional statistician, so I didn’t bother to read this latest work by you and Ryan O.

    But everything Steig says seems entirely reasonable.

    REPLY: If you wouldn’t mind, show me where I screwed up and said something inappropriate so I can apologize.

  5. Ryan O said

    I didn’t reply to the overfitting claim itself at RC (as that is easily shown to be false from our verification statistics), but I must admit I am curious. Please explain how using additional PCs results in “overfitting”. What, exactly, is being “fit”?
    Now, if you choose to complain that using too high of a regpar setting in RegEM leads to overfitting, you have a valid claim. If you choose to complain that using too many PCs in a calibration process leads to overfitting, you have a valid claim. But claiming that a data reduction process on the AVHRR information is somehow subject to “overfitting” is, quite simply, a false claim. Nothing is “fit” during a PC decomposition.
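    To make the distinction concrete: a principal component decomposition is an exact matrix identity, and truncating to k PCs is pure data reduction, not parameter fitting. A toy numpy sketch (synthetic data, nothing from the actual reconstruction code):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))   # stand-in for a space-time data matrix

# The SVD reproduces X exactly: no parameters are "fit" to anything.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vt, X)

def truncate(k):
    """Rank-k reconstruction: keep only the first k principal components."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

# Retaining more PCs monotonically reduces the amount of information
# discarded -- the opposite of adding free parameters to a model.
errs = [np.linalg.norm(X - truncate(k)) for k in (3, 13, 20)]
assert errs[0] > errs[1] > errs[2]
```

    Whether 3 or 13 PCs are retained, nothing in this step is tuned against a target, which is the point of the distinction drawn above between the decomposition and the regression stage.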

  6. Ryan O said


    I figured I’d repost my reply here as well as at CA. It hasn’t made it through moderation yet.


    As the “Ryan O” to which you refer, I would like to have the opportunity to respond to the above.

    First, the discussion in North about statistical separability of EOFs is related to mixing of modes. Statistical separability is never stated or implied as a criterion for PC retention, except insofar as degenerate multiplets should either all be excluded or all be retained. I quote the final sentences from North (my bold):

    The problem focused upon in this paper occurs when near multiplets get mixed by sampling error. So long as all of the mixed multiplet members are included, there is no special problem in representing data and the same level of fit. However in choosing the point of truncation, one should take care that it does not fall in the middle of an “effective multiplet” created by the sampling problem, since there is no justification for choosing to keep part of the multiplet and discarding the rest. Other than this, the rule of thumb unfortunately provides no guidance in selecting a truncation point for using a subset of EOF’s to represent a large data set efficiently. Additional assumptions about the nature of the “noise” in the data must be made.

    The criteria set forth in North do not suggest, in any way, shape or form, that only statistically separable modes should be retained. As statistical separability is not a constraint for either SVD or PCA, mixed modes often occur. Indeed, there is a whole subset of PCA-related analyses (such as projection pursuit and ICA) dedicated to separating mixed modes. PP and ICA by definition would not exist if statistical separability were a criterion for PC retention, as both PP and ICA require the PC selection be made ex ante. Calling statistical separability a “standard approach” to PC retention is unsupportable.
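    North’s “rule of thumb” concerns eigenvalue sampling error, roughly δλ ≈ λ·sqrt(2/N): neighbouring eigenvalues whose error bars overlap form an “effective multiplet” that should be kept or discarded as a unit. A small sketch of that grouping logic (hypothetical eigenvalue spectrum, not the AVHRR one):

```python
import numpy as np

def effective_multiplets(eigvals, n_samples):
    """Group eigenvalues whose North et al. (1982) sampling errors,
    delta_lambda ~ lambda * sqrt(2/N), overlap with their neighbour's."""
    err = eigvals * np.sqrt(2.0 / n_samples)
    groups, current = [], [0]
    for i in range(1, len(eigvals)):
        # Neighbouring eigenvalues are "effectively degenerate" when the
        # gap between them is smaller than their combined sampling error.
        if eigvals[i - 1] - eigvals[i] < err[i - 1] + err[i]:
            current.append(i)
        else:
            groups.append(current)
            current = [i]
    groups.append(current)
    return groups

# Example spectrum: modes 2 and 3 are close enough to mix under sampling,
# so they must be retained together or dropped together.
lams = np.array([10.0, 5.0, 4.8, 1.0])
print(effective_multiplets(lams, n_samples=50))
```

    Note that the rule only says where a truncation must not fall; as the quoted passage states, it gives no guidance on where the truncation should fall.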

    Second, as far as verification statistics are concerned, the improvement in both calibration and verification from using additional PCs is quite significant. This obviates the concern that the calibration period improvement is due to overfitting. I have provided fully documented, turnkey code if you wish to verify this yourself (or, alternatively, find errors in the code). The code also allows you to run reconstructions without the satellite calibration being performed, to demonstrate that the improvement in verification skill has nothing whatsoever to do with the satellite calibration. The skill is nearly identical and, in either case, significantly exceeds the skill of the 3-PC reconstruction. The purpose of the satellite calibration is something else entirely (something that I will not discuss here).
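    The reasoning deserves one more step: adding parameters always improves the calibration-period fit, so calibration skill by itself proves nothing; it is skill on the withheld verification data that discriminates a real improvement from overfitting. A toy sketch with synthetic data (unrelated to the reconstruction itself):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(40)

# Withhold every other point for verification; calibrate on the rest.
cal, ver = slice(0, 40, 2), slice(1, 40, 2)

def skill(degree):
    """Mean squared error over the calibration and verification samples."""
    pred = np.polyval(np.polyfit(x[cal], y[cal], degree), x)
    return (np.mean((y[cal] - pred[cal]) ** 2),
            np.mean((y[ver] - pred[ver]) ** 2))

cal3, ver3 = skill(3)     # modest model
cal12, ver12 = skill(12)  # over-parameterized model

# In-sample (calibration) error always drops as parameters are added;
# overfitting is diagnosed when the *verification* error rises instead.
assert cal12 < cal3
print(cal3, ver3, cal12, ver12)
```

    If both numbers improve together, the overfitting signature is absent, which is the substance of the claim above.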

    Thirdly, this statement:

    Further, the claim made by ‘Ryan O’ that our calculations ‘inflate’ the temperature trends in the data is completely specious. All that has done is take our results, add additional PCs (resulting in a lower trend in this case), and then subtract those PCs (thereby getting the original trends back). In other words, 2 + 1 – 1 = 2.

    is misguided. I would encourage you to examine the script more carefully. Your results were not used as input, nor were the extra PCs “subtracted out”. The model frame of the 13-PC reconstruction – which was calculated from the original data (not your results) – was used as input to a 3-PC reconstruction to determine whether 3 PCs had sufficient geographical resolution. The result is that they do not. I refer you to Jackson ( ) for a series of examples where similar comparisons using real and simulated data were performed. Contrary to your implication, this type of test is quite common in principal component analysis.

    Lastly, I take exception to the portrayal of the purpose of this to be, in your words:

    It appears this is a result of the persistent belief that by embarrassing specific scientists, the entire edifice of ‘global warming’ will fall. This is remarkably naive, as we have discussed before. The irony here is that our study was largely focused on regional climate change that may well be largely due to natural variability, as we clearly state in the paper. This seems to have escaped much of the blogosphere, however.

    Nowhere have I stated – nor even implied – that my analysis makes any statement on AGW whatsoever. The purpose was to investigate the robustness of this particular result.

  7. neill said


    are you really claiming that you located something insulting Jeff wrote in the comment thread following the article — yet you couldn’t be bothered to read the article itself?


  8. Jeff Id said

    I was surprised that Dr. Steig referred to the verification issue. He’s completely missed the fact that his stats rely almost exclusively on short term variance. It’s so extreme that it’s almost moot which dataset is used in ‘verification’.

    I think that’s why an even heavier correction would be appropriate for the sat data. The intent would be to force matching with surface trend while maintaining short term covariance info.
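    The point about short-term variance is easy to demonstrate: when two series share their month-to-month variability, even a large difference in underlying trend barely moves correlation-based verification statistics. A synthetic sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
months = np.arange(600)                 # 50 years of monthly anomalies
weather = rng.standard_normal(600)      # shared short-term "weather" noise

a = weather + (0.05 / 120.0) * months   # trend of 0.05 per decade
b = weather + (0.25 / 120.0) * months   # trend of 0.25 per decade

# A five-fold difference in trend barely dents the correlation, because
# the month-to-month variance the two series share dominates it.
r = np.corrcoef(a, b)[0, 1]
print(r)
```

    A verification statistic driven by this correlation cannot distinguish the two trends, which is why the choice of ‘verification’ dataset matters so little here.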

  9. Ryan O said

    Indeed. Also, I think I’m going to limit any other replies by me on RC to items already covered in the posts, including the satellite issues. 😉

  10. hunter said

    RC knows implicitly that if their valiant effort of orthodoxy enforcement fails even a bit, AGW in fact does fall.
    Therefore every critique of AGW must be disproven, and every promotion of AGW must be defended. And if disproven means ‘attack with untruth, ad hom, dissembling, etc.’, then so be it.
    Schmidt and pals have a tough, and at the end of the day impossible, job: they have to prove something that does not exist, yet claim it is science. Their only hope is to draw attention away from those who point out that AGW is in fact a closet full of invisible suits.

  11. Ryan O said

    #10 In truth, this is not a critique of AGW (though they seem to be taking it that way). It is a critique of the method for performing this type of reconstruction. While this may have some impact on other similar reconstructions, it certainly does not constitute a rejection of AGW. I think the larger point would be to attempt to undermine any independent analysis as categorically insufficient in order to call into question the credibility of anything posted at the skeptic blogs.

    More of a PR thing than a science thing.

  12. Bishop Hill said

    Didn’t Dr Mann say during the Hockey Stick controversy that Preisendorfer’s Rule N is the “standard rule” for determining how many PCs to retain? Funny to see him as co-author on a paper that uses a different rule. Or am I missing something?

  13. Andrew said

    I just want to expand on what Hunter said.

    If AGW was a fact, people would find a way to deal with it as such. As it is, facts are boring.

    What works better is the imagination. All kinds of imaginary problems and solutions can be dreamed up.

    AGW promoters have chosen the imaginary to promote their product. Which works great in some respects, but everything has an upside and a downside. The upside I’m sure is quite enjoyable.


  14. Sparkey said

    Bishop Hill #12
    I was just going to post the same question. Sigh…

  15. Jeff Id said

    That’s about it for me again. Dr. Steig says:

    [Response: No comments of yours have been deleted that I’m aware of, at least not to this post. Still, I’m not at all interested in debating you — I’ve got much better things to do. Let me be very clear, though, that I’m by no means claiming our results are the last word, and can’t be improved upon. If you have something coherent and useful to say, say it in a peer reviewed paper. If your results improve upon ours, great, that will be a useful contribution.–eric]

  16. Ryan O said

    I’m still waiting for my comment to clear moderation . . .

  17. Kenneth Fritsch said

    I say let Ryan O discuss the pertinent issues with Eric Steig and others of us can try to keep the discussion on track by dropping references to AGW, Mann and other peripheral issues and keep our comments on the issues at hand.

    I think we can learn from Steig’s replies his capabilities in using the methods the authors of Steig et al. applied in their paper and his understanding of those methods.

    I was totally taken aback by Steig’s reference to North (1982); I thought I had surmised its content correctly, and was pleased to see Ryan agree and excerpt the pertinent paragraph from the North paper. So far Steig has not really explained the meaning of his term “overfitting” in PC retention, or offered any alternative method or criteria for selecting PCs.

    His initial anecdotal example of overfitting could just as well have made a case for retaining only 1 or 2 PCs and not 3 – assuming, of course, that no other information was available.

    The tone of Steig’s replies can be considered nothing if not condescending and might make an undergrad, or even a grad student, back down but I do not see it being effective in a discussion with Ryan. I am hopeful that a more detailed discussion is in the offing, but I will be pleasantly surprised if it happens.

  18. Plimple said

    Ryan O,

    I was impressed by your analysis in two parts shown here and on WUWT. In fact, pending verification that you’d carried out the inter-satellite calibrations/corrections in an adequate manner, I believed that you had indeed shown your method to be superior to Steig’s. Your verification statistics seemed to validate your a priori assumptions regarding PC retention. Perhaps I should have read your original posts better, but I had missed the section of your method description where you adjusted the AVHRR data using the surface observations. I’m not sure this is a valid approach. Dr. Steig is now claiming that your method differs from his on this key point. If this is the case, then that would certainly present a plausible explanation of both why your method seems to yield improved verification statistics and why it yields differing results from Dr. Steig’s method. Do you agree that this is why you both get different verification stats using the same number of PCs?

    I note your criticisms of Steig’s a priori choices for PC selection. Based upon his a posteriori analysis of the PC selection, and assuming his numbers are independent from yours per the discussion above, do you agree he made the correct selections?

  19. Eric Anderson said

    “If you have something coherent and useful to say, say it in a peer reviewed paper.”

    Ah, yes. Run back to mama and hide. The ol’ “I can’t hear you, unless you’re in the club” mentality.

    Too bad Steig isn’t willing to open up a bit more — It seems a valuable debate/discussion could be had and that everyone would benefit from it.

  20. Andy said

    “I’m not at all interested in debating you — I’ve got much better things to do.”

    And this man claims to be a scientist?

  21. Jeff Id said

    #17, I think Steig will reply to Ryan because he didn’t snip my reference to it; we’ll see.

    Ryan’s argument was pretty devastating though. I mean not just a little, but really a strong reply.

    The tone of Steig’s replies can be considered nothing if not condescending and might make an undergrad, or even a grad student, back down but I do not see it being effective in a discussion with Ryan.

    I’d have to agree; it doesn’t have any effect on me either. It’s just a bit childish to pretend to have obvious superiority. The thing about it is that this isn’t the most complex thing in the world. It’s not like they are doing quantum physics or turbulent flow dynamics; it’s linear algebra!

    The other thing is that there is no hard and fast rule on how many PCs to retain, although Ryan found some better rules of thumb. A reconstruction that matches the data better is generally going to be a better reconstruction.

  22. Plimple said

    Ryan O,

    Just to clarify, I mentioned that I skimmed over your calibration methodology. When I first looked at figure 5:

    I initially thought this was an offset derived from your analysis of the literature and its recommendations for offsets for the different satellites and time periods. I had no idea that this was merely an empirically calculated offset based purely on an assumption that the ground stations represent truth. Aside from noting other researchers’ observations of this phenomenon, can you provide any other justification for doing this?

  23. Ryan O said

    #18 No, the satellite calibration is a separate issue. The answer to your question (as to whether it affects the verification statistics or changes the trend) is here:

    It does neither.

    Steig’s criticism on this point is not valid (at least not the way he presents it). There is a valid criticism of my ad hoc adjustment, however, which is why I ran the reconstruction both ways. In Steig’s case, he is claiming that we are adjusting the satellite data in order to achieve our desired result. If he had followed how this on-line critique of his paper had unfolded, he would realize this is false. One of the first things I did was the calibration. I didn’t do a reconstruction until well after the calibration had been completed – so there was no way for me to “tune” the calibration to give the answer I desired.

    Additionally, any time you splice two different measurement methods together, you must calibrate them. It is not an option. It is Steig’s burden of proof to show that splicing the 3 AVHRR PCs on the end of the ground data does not result in spurious trends. It is Steig’s burden of proof to show that the satellites and ground data are measuring identical things. It is Steig’s burden of proof to show that the satellite data can be used – as is – without any calibration to the ground data.

    It is my contention that they are not measuring the same thing.

    Indeed, Steig himself recognizes this problem in his paper, which is why he presents a reconstruction using detrended AVHRR data. The only purpose for doing this was to show that trends in the satellite data did not affect reconstruction trends – i.e., justify not doing a calibration.

    Along with that, Steig cites Shuman (who did a 37GHz study) as having done work that supports the conclusions in the paper. In Shuman, the first thing that was done was an ad hoc correction to the satellite data using an offset and a fit to a sine curve – in other words, exactly the same thing I did. Apparently, this kind of calibration is only valid when it supports the conclusions of Steig’s paper.

    Steig’s own paper contradicts the arguments he is putting forth concerning the calibration to ground data.
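    For reference, a Shuman-style correction – an offset plus a fit to an annual sine – is just ordinary least squares on the satellite-minus-ground difference. A toy sketch with made-up series (not the actual AVHRR calibration):

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(240) / 12.0                       # 20 years, monthly

ground = 0.5 * np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(240)
# Satellite series: same signal plus an offset and a seasonal-cycle bias.
sat = ground + 1.2 + 0.3 * np.sin(2 * np.pi * t + 0.5)

# Shuman-style ad hoc calibration: fit offset + annual sine/cosine to the
# satellite-minus-ground difference, then remove the fitted bias.
d = sat - ground
A = np.column_stack([np.ones_like(t),
                     np.sin(2 * np.pi * t),
                     np.cos(2 * np.pi * t)])
coef, *_ = np.linalg.lstsq(A, d, rcond=None)
sat_cal = sat - A @ coef

# In this idealized case the bias is removed completely.
assert np.allclose(sat_cal, ground)
```

    The fitted constant and seasonal terms absorb the inter-instrument bias while leaving the month-to-month covariance untouched, which is the stated purpose of such a calibration.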

  24. Zeke said

    Andy,

    Oddly enough, most scientists don’t spend that much of their day debating people in blog comments, though perhaps scientists who (semi)-actively blog should. Now, if Nature accepts a response from Ryan et al, he would be forced into one.

    From past experience, Eric tends to be a fairly nice guy. A bit of email correspondence to discuss and clarify issues of disagreement would not be amiss before any bridges are burned, though it will be interesting to see the response to Ryan O. once it makes it through the moderation queue.

    Part of the problem here is that it was likely the WUWT post entitled “Steig et al – falsified” that caught Eric’s attention (and likely annoyed him a bit). That’s a far cry from Ryan’s more academic remark that “I am perfectly comfortable saying that Steig’s reconstruction is not a faithful representation of Antarctic temperatures over the past 50 years and that ours is closer to the mark.” After all, the fact that both sets of analyses showed similar trends over the same period (albeit considerably lower in the case of Ryan O’s analysis) hardly amounts to a falsification.

    Furthermore, Eric’s claim that some critics conflate this single paper with “falsifying AGW” is taken straight out of the top of the comment thread at WUWT: Pierre Gosselin (02:00:55) : The Steig paper is falsified. The entire AGW science is falsified.

    In many ways WUWT acts like the press release office at a university: it gives prominence to your work outside of a technical audience, but it has a bad habit of exaggerating the conclusions at times.

  25. Jonathan said

    Jeff Id, #21, actually most of the interesting bits of quantum physics are just linear algebra 🙂

  26. mack520 said

    Jeff Id (20) — The internet is best effort, but without guarantees. Sometimes comments are simply lost. Yes- that would explain it (re Ryan O)

  27. mack520 said

    Sorry- the above is quoting David Benson’s comment 24 on RC
    “Jeff Id (20) — The internet is best effort, but without guarantees. Sometimes comments are simply lost.”

  28. Jeff Id said

    #22, Ryan’s answer aside, I believe there is quite a bit written on the topic. The data is known to have some significant problems.

    My own point is that the short term variance is so great that long term signal corrections as the one Ryan used have no real effect on verification statistics.

    From JeffC

    From the NSIDC data repository on AVHRR

    Product validation is a continuing process that takes advantage of comparative data as they become available. Comparisons were made between AVHRR Polar Pathfinder clear sky skin temperatures and surface-based measurements obtained at the South Pole over a seven-day period in 1995. These field data were collected by Robert Stone of the Cooperative Institute for Research in Environmental Sciences (CIRES) using a sled-mounted KT-19 pyrometer. Excluding observations when cloud cover was present, the agreement was generally within 0.5 kelvin. For data averaged over a four-hour period, temperatures were within 0.1 kelvin. A mean of -38.15 degrees Celsius for the AVHRR Polar Pathfinder observations, versus a mean of -38.25 degrees Celsius for the field data. Evaluations were also performed for the AVHRR Polar Pathfinder retrievals of surface albedo over the Greenland Ice Sheet through comparisons with albedo measured at 14 Automatic Weather Stations (AWS) around the Greenland Ice Sheet from January 1997 to August 1998. Results show that AVHRR-derived surface albedo values are, on average, 10 percent less than those measured by the AWS stations. However, station measurements tend to be positively biased by about four percent, and the differences in absolute albedo may be less, about six percent. In regions of Greenland where the albedo variability is small, such as the dry snow facies, the AVHRR albedo uncertainty exceeds the natural variability. Stroeve concluded that while further work is needed to improve the absolute accuracy of the AVHRR-derived surface albedo, the data provide temporally and spatially consistent estimates of the Greenland Ice Sheet albedo (Stroeve et al. 2001) and (Stroeve 2002).

    Analyses of the AVHRR Polar Pathfinder data, compared with data from the Surface Heat Budget of the Arctic Ocean (SHEBA) project, are in progress. See (Maslanik et al. 2000) for preliminary results. The cloud masking process was assessed and refined throughout the duration of the project to optimize the algorithm for the entire areas of coverage. Comparisons of areally-averaged cloud fractions from the AVHRR Polar Pathfinder Twice-daily 5 km EASE-Grid Composites with field observations at the SHEBA field site show that the AVHRR data were within nine percent of the cloud lidar/radar observations averaged from April to July 1998 with Pathfinder data underestimating cloud fraction relative to the field measurements. Differences in monthly means for this period ranged from 2 percent in June to 21 percent in July. Comparison of all-sky skin temperature and albedo values derived from the AVHRR Polar Pathfinder Twice-daily 5 km EASE-Grid Composites with SHEBA observations is described in (Maslanik et al. 2000).

    Other validation studies of surface temperature and albedo retrieval procedures included surface observations from a NOAA research site near Barrow, Alaska, 71.32 degrees north latitude, 156.61 degrees west longitude. Daily AVHRR data from a preliminary Pathfinder data set from mid-1992 to mid-1993 were used for this validation (Meier et al. 1997). Surface temperature estimates agreed with observations, with a correlation coefficient of 0.98, a bias of -0.97, and a RMSE of 4.70. For surface albedo, the bias (mean error) in the estimates was near zero, r=0.81, bias=0.00, RMSE=0.17, but the individual observations exhibited significant variability, attributed to surface inhomogeneity and retrieval scheme sensitivity to changes in atmospheric aerosol and water vapor amounts. Accuracies of the products are difficult to determine given the limited nature of existing case studies. Also, conditions vary substantially across the large product domains and over time. Plans are being developed to further define product accuracies for snow-covered areas, sea ice, and ice sheets. Based on studies to date, accuracies in general are approximately ± 2 kelvin for AVHRR-derived clear sky skin temperatures and ± 0.06 kelvin for albedo. Much of this error is likely due to uncertainties in the performance of the cloud detection methods. For clear sky conditions, accuracies for albedo and temperature products are expected to be in the range noted in the Greenland Barrow case studies.

    The temps are +/- 2K – a pretty big variation considering the signal we’re looking for. This is the reason I believe the best method is to apply a strong correction to match surface station data and leave the short term covariance intact for RegEM. Ryan’s corrections were small in comparison.

  29. Ryan O said

    #28 Also, RegEM itself tells you that there’s something wrong with the satellite data: the 1982-1994.5 period generates a 1957-2006 trend of ~0.06 Deg C/Decade, while the 1994.5-2006 period generates a trend of ~0.113 Deg C/Decade – nearly double. The offset shows up plain as day in the split reconstruction test.
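    The logic of the split test is worth spelling out: a pure step offset at the satellite changeover, with no underlying warming at all, still regresses to a nonzero full-period trend. A toy sketch (made-up numbers, not the reconstruction output):

```python
import numpy as np

def decadal_trend(t_years, series):
    """Least-squares slope of the series, expressed per decade."""
    return 10.0 * np.polyfit(t_years, series, 1)[0]

# A flat series with a 0.3-degree step at the 1994.5 changeover:
# regressing across the splice manufactures a trend out of the offset.
t = 1957.0 + np.arange(600) / 12.0
temp = np.where(t >= 1994.5, 0.3, 0.0)

full = decadal_trend(t, temp)                          # spurious, nonzero
early = decadal_trend(t[t < 1994.5], temp[t < 1994.5]) # exactly flat
print(full, early)
```

    Splitting the input at the changeover and comparing the two implied long-term trends, as described above, exposes exactly this kind of offset.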

  30. Ryan O said

    Also, like Zeke said, I can see Steig objecting to some of the ways our analysis was characterized. After all, we do still show warming overall from 1957-2006. However, as both Jeffs and I have said, the main goal was to investigate the validity of the method. From that perspective, yes, I believe we have called it into serious question. Not only that, but the geographic location of the trends is rather important.

    But even assuming that Steig’s reconstruction gave the “right” answer, there are serious defects in the methodology.

    As has been said about another issue in the recent past . . . wrong method + right answer = bad science.

  31. wmanny said

    Attempted to post the following at 4:13 PM. Did not pass moderation, nor did I expect it to:


    “If you have something coherent and useful to say, say it in a peer reviewed paper.”

    Why the double standard? At the behest of colleagues, you have posted a response to those you claim are guilty of overfitting. Your response is noted, and not surprisingly, they write back. And they are told to produce a peer reviewed paper? Why not simply continue to engage in a discussion that you have joined? Real Climate is not a peer reviewed blog.

    Otherwise stated, if you have much better things to do than what you are here doing, why do it?

    Walter Manny

  32. Jeff Id said


    Is your comment still in moderation?

  33. Ryan O said

    Nope. Hahahahaha. 🙂

  34. Jeff Id said

    #33 Damn, that’s good stuff. These guys need a press adviser.

    I’m going to work on a detailed reply, cause bloggin’ is fun. – too bad you don’t have email.

    They try to make it sound like we don’t get the argument. We’ve replicated the math, analyzed the statistics, improved on the result, done it different ways and still can’t even reach the point of discussion.

    It’s getting pretty weak over there at RC.

  35. Plimple said

    Ryan O,

    “In Steig’s case, he is claiming that we are adjusting the satellite data in order to achieve our desired result.”

    Is he? He certainly indicates that you adjusted the satellite data and your own analysis steps indicate this. He certainly indicates that this leads to a particular result. He certainly questions the motives of individuals connected with this blogosphere effort based on various statements made across various blogs suggesting the sole purpose of Steig’s method was to yield a particular result. What I do not see is an accusation directed at yourself regarding motives and how this might affect your own methodological decisions. And he certainly does make a statement linking your calibration methodology and your own motives.

    After all, how is he to distinguish the polite person who has been emailing him regarding his methods (your own statements indicate that this dialogue was fruitful and polite) from the mass of posters on WUWT, here, and elsewhere, pointing fingers, questioning motives, and being generally rude?

    In summary, from what I’ve seen, you’ve conducted yourself in a polite manner. Steig can be polite; we know that. He is perhaps being condescending now. Certain comments from individuals could have annoyed him, so perhaps he is projecting annoyance about those issues onto the dialogue with you. Certainly your close association with Jeff Id, who has made several comments questioning motives and insinuating conspiracies, could be confusing him.

    “There is a valid criticism of my ad hoc adjustment, however, which is why I ran the reconstruction both ways.”

    I agree, do you think it is valid? Can you support its validity? You say you’ve done the reconstruction both ways, do you have a link to your analysis?

    “It is my contention that they are not measuring the same thing.”

    I agree; isn’t that obvious? AVHRR retrieves surface temperature, not surface air temperature. Where does Steig et al. explicitly state that they are measuring the same thing? Is Steig et al. sensitive to that assumption? Or is it sufficient to assume that they correlate well?

  36. Plimple said

    Ryan O,

    Can I check, are you using the Pathfinder AVHRR dataset?

    Oh, and damn!

    “And he certainly does make a statement linking your calibration methodology and your own motives.”

    Should read:

    And he certainly does NOT make a statement linking your calibration methodology and your own motives.

  37. neill said

    perhaps our intrepid cyberspace ringside reporter is a bit over the top here:

    ….and the current titleholder Steig bolts from his corner at the start of the round, launching a massive right-hand shot to the body of the challenger Ryan O. Yet the effort causes Steig to drop his left slightly, and the challenger capitalizes on the opening with a lightning right cross flush to Steig’s jaw, staggering the champ…..

    The crowd is on its feet, the roar deafening.

    Suddenly, Steig unsteadily backs away, begins to untie his boxing gloves, then pulls them off. The gloves drop to the canvas. Just as suddenly, you can hear a pin drop in the cavernous arena.

    Steig says, “I’m not at all interested in debating you — I’ve got much better things to do. Let me be very clear, though, that I’m by no means claiming our results are the last word, and can’t be improved upon. If you have something coherent and useful to say, say it in a peer reviewed paper. If your results improve upon ours, great, that will be a useful contribution.”

    A cascade of boos descends upon the ring along with some rotten fruit, which Steig ducks to avoid. He glares at the challenger, then sniffs. Steig hurries to slip through the ropes at ringside, hops down and trots quickly out of the arena.

    His gloves remain in the ring, a forlorn target for the crowd’s growing fury.

  38. Ryan O said

    #35 From Steig’s post:

    “It appears that what has been done is first to adjust the satellite data so that it better matches the ground data, and then to do the reconstruction calculations. This doesn’t make any sense: it amounts to pre-optimizing the validation data (which are supposed to be independent), violating the very point of ‘verification’ altogether.”

    In other words, he’s claiming that the reason our verification stats are better is because we deliberately made it that way by matching the satellite and ground data. Hence, we achieved the desired result simply by offsetting the satellites. However, he did not calculate how much this actually contributes to improvement in verification (entirely negligible) and instead simply declares that the improvement is due to this offsetting.

    Note that I said there was a valid criticism of my calibration. Steig’s criticism ain’t it.


  39. Ryan O said

    Hey, Jeff, my comment now appears again in the “in moderation” queue. Not sure why it disappeared for awhile . . . could simply have been something on my end.

  40. Plimple said

    “However, he did not calculate how much this actually contributes to improvement in verification (entirely negligible) and instead simply declares that the improvement is due to this offsetting.”

    Given this, and given Steig’s statistics on RC, there seems to be an inconsistency between your result and his… assuming, of course, that there are no other differences in methodology. Are there? Do you agree?

    Apologies, I don’t have the time to go trawling through the comments at CA. Can’t you just explain, or give a specific post #?

  41. Jeff Id said

    #35 Plimple,

    I don’t mind the discussion but I have to point out that the reason Steig and RC won’t let me post is unrelated to this paper. It is due to the hockey stick M08 using CPS methods. This method is astoundingly fraudulent in my opinion, and there is no other word for it! I will not apologize for that because it ain’t my fault it’s Mann et al for publishing and the peers for signing off on it. At the time I figured it out, I didn’t know who RC even was :).

    I have not insinuated fraud in the Antarctic paper. I have questioned why, of all the possible answers for RegEM, they chose the one which generates the highest trend. The odds are substantial that they tried other values in RegEM, as Steig has recently done with some kind of weird PCA analysis. This is a valid question. The claim that I have insinuated conspiracies is an unwelcome ad-hom.

    Like the Dhog, if you have something specific, put it up so I can apologize, otherwise keep this [snip] to yourself or put it up on RC where people will believe you.

  42. Ryan O said

    #40 I do not know what Steig is referring to with his verification statistics. He described neither how he obtained the trends for his latest post nor how he calculated his verification statistics. The statistics I put up were calculated using exactly the same method as Steig described in his paper.

    Furthermore, if he simply increased the number of PCs and left regpar alone, he will not get the same results as we do. This is because though the PC selection determines the faithfulness of the AVHRR representation, it is the number of retained eigenvectors in RegEM that determines the reconstruction. So if he merely increased the number of PCs and put them into the same meatgrinder, his results will diverge from ours. Indeed, we spent several posts explaining why this happens (see “RegEM: The Magic Teleconnection Machine”) and why regpar should be set higher than 3.

    Lastly, part of the “Antarctic Coup de Grace” dealt directly with a methodological issue with Steig’s paper. Steig imputes all of the PCs and ground stations together simultaneously; we do not.

    For the “reasons why the offsets don’t affect the result”, CA seems to be down for the moment. It was post #10 or something in there. But basically the offsets I applied were small enough that they do not affect the verification statistics. There can be no accusation of “tuning” the reconstruction to achieve good statistics because the statistics and results are the same whether the tuning is performed or not.
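    To see why a constant offset can’t move correlation-based verification statistics, here is a quick sketch. This is illustrative Python with made-up series, not the R reconstruction script; the 0.75 offset is an arbitrary stand-in for a calibration shift.

```python
import numpy as np

# Toy series standing in for ground truth and satellite data;
# the 0.75 offset plays the role of an arbitrary calibration shift.
rng = np.random.default_rng(42)
truth = rng.standard_normal(300)
sat = truth + 0.1 * rng.standard_normal(300)
sat_offset = sat + 0.75

def anomaly(x):
    # the anomaly calculation absorbs any constant offset
    return x - x.mean()

r_raw = np.corrcoef(anomaly(sat), anomaly(truth))[0, 1]
r_off = np.corrcoef(anomaly(sat_offset), anomaly(truth))[0, 1]
print(r_raw - r_off)  # effectively zero
```

    The same invariance holds for any statistic computed on anomalies, which is why the verification numbers come out identical with or without the calibration.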

    If I have time later, I will post the plots of all of this. It will take a bit of time, but won’t be a problem. 😉

  43. Ryan O said

    Jeff, hey, I don’t think it’s my original post. The new one in the moderation queue is #30, and my old one was like #5 or so.

    Also, except for the bolding, all of my italics are missing. Methinks they may have originally snipped it, but then resurrected it. 🙂

  44. John M said


    Good discussion. Maybe to keep it going without waiting for CA to come back up, I think the comment on CA Ryan is referring to is #9, which was captured by the Google cache.


  45. I even posted a gentle reminder that they should let your post through moderation, but my post appeared just after yours did! Oops.

  46. Ryan O said

    #45 I saw that and had a chuckle. Hehe! 🙂 Working on the plots for the verification stats right now. I’ll post them here, and then link to them at RC because last time I tried to post pictures on RC it didn’t work. Oh, and I’ll also post instructions on how to alter the code to perform the reconstructions and verification without calibration of the satellites. 😉

  47. Plimple said

    Ryan O,

    Thank you for your explanation. Thinking about it some more, I think Steig’s plots possibly are different due to different regpar choices, at least that seems more likely than anything else right now.

    I wasn’t accusing you of “tuning”; is Steig? Tuning implies some sort of direct and deliberate effort to get a specific result. Anyway, as you indicate, it’s moot. I was interested both in the sensitivity of the result to the input assumptions and in the validity of those assumptions. You’ve convinced me, for the moment, that this assumption doesn’t affect the results. I’m not convinced it’s the best assumption, but there doesn’t seem to be an obvious and purely objective way to arrive at the best calibration assumption.

    Jeff Id,

    There’s a fairly open and plainly conspiratorial discussion of the motivations behind Steig’s paper, and the reasons for defending it, in what you wrote on WUWT dated 30 05 2009, Jeff Id (06:42:14). On ad homs: my comments are certainly an opinion about your comments, granted, but I have not used your comments as the basis of an argument against your work with Ryan O – that would be ad hom. You may not agree with my characterisation and opinions, but they are set apart from the discussion of the work on Steig’s paper. You do seem to be able to post on RC now, though.

  48. Ryan O said

    #47 Yes. He stated that the source of the improved verification statistics is due to the fitting of the satellite data to the ground data, which (if that was why I did it) could be considered “tuning”.

  49. Jeff Id said


    You know Plimple, you’re almost right. I let loose that morning; it was the third or fourth time I had explained the clear difference between our reconstructions (my results, Jeff C’s, Ryan’s, and Steig’s).

    The warming Antarctic is apparently required for validation of models. RC had a post last year which said the Antarctic is cooling in accordance with models, and now this year they have a post which says it’s warming in accordance with models. I can see where you would get the impression that it was a conspiracy claim, but it was basically frustration at nailing down a constantly moving target. If the goal shifts again you can bet the models will move.

    Consider, though, what modelers would do if the models don’t agree with observation. Will they throw out the models as invalid? Hell no. They have two choices: change the data or change the models.

    If the data is not in question you can bet the models will change – no conspiracy, just reality. So while my post was in an admittedly frustrated tone, please don’t give me a tinfoil hat for it.

  50. Jeff C. said

    Oops-accidentally put this on the wrong thread at first, it belongs here…

    Ryan and Jeff – great job and thanks for keeping the heat on after all these months. I’ve been following with interest and was happy to see Ryan’s post at WUWT. I think Anthony’s headline might have been a little strong, and that might have twisted Dr. Steig’s tail a bit and clouded his response. I hope he takes the opportunity to reread the article and follow some of the links, and doesn’t get hung up on the satellite transition stuff. It’s an important point, but Ryan was clear it didn’t significantly influence the results.

    The amount of work dedicated to reconstructing his methodology is truly staggering (Jeff Id, Ryan O, Roman M, Steve Mc and lots of good comments), although the most praise should go to the wives for putting up with this for the last five months.

    In a comment above, Jeff linked to a post I did regarding satellite transitions. In a later post I described a better data set for determining cloud cover that I found at the University of Wisconsin website. In this data set, there are still troubling discontinuities in the satellite record, particularly from NOAA-14 to NOAA-16. I corresponded with those responsible for the data set at UWisc and they told me they were aware of the problem but did not know how to fix it. Their candor was admirable.

    That post shows that Dr. Comiso’s data set has a discontinuity at the same point. There are definitely real problems with the satellite transitions that should have been fixed or at a minimum disclosed. However, the real issue here is the reduction to 3 PCs for no good reason. The reasons given (spatial similarity to known atmospheric phenomena) are specious as shown by Steve M in the Chladni posts. The sidetracking to the satellite transitions is a bit of a smokescreen thrown out by the team and its fans.

    I’ve been missing from the debate for the past six weeks as my five-year old son has been dealing with some long-term medical issues. He is now doing fantastic and has made an almost complete recovery from a neurological disorder. Jeff offered to let me post on this and I’ll probably write something up in the future. In the meantime, I look forward to getting back into the discussion.

  51. Jeff Id said

    Ryan, You got a response from Dr. Steig.

    I must be on the fellow blogger list.

  52. Jeff C. said

    Re: Jeff Id – from the other thread:

    “I think we could use some help with the cloud data if you get the time. Ryan has done some interesting corrections, however if a pre-determined basis was used for correction which brought trends in line it would be better. No idea if it can be done, after all Dr. Comiso did pretty well with what he had to work with.”

    Sure, I’d be happy to help. When I left off, I had found two troubling issues:

    1) Spacecraft transition discontinuities
    2) Areas with highest cloud cover had the most warming

    #1 is probably some correction improperly applied. #2 is really interesting, as it could mean the cloud masking methodology is causing at least part of the reported warming. The correlation between % cloud cover and amount of warming is striking. I can’t think of a good reason to explain this unless cloud cover increases over time, and the record doesn’t show that.

    Send me an email when you get a chance and let me know what you need.

  53. Ryan O said

    I have finished putting together plots for the verification statistics for a reconstruction using uncalibrated AVHRR data. For reference, here are the statistics for Steig:

    Here are the statistics for the original 13-PC reconstruction using the calibrated AVHRR data:

    And here are the statistics for a 13-PC reconstruction using UNCALIBRATED AVHRR data:

    As you can see, the calibration of the AVHRR data has no effect on the verification statistics.

    To replicate this for yourself, find these lines in the “RECONSTRUCTIONS” segment:

    ### Perform a 13-PC reconstruction using calibrated AVHRR data
    ### with manned and AWS stations, and no ocean stations.

    dat=window(calc.anom(, start=c(1982), end=c(2006, 12))
    base=window(, form=clip), start=1957, end=c(2006, 12))

    Replace the bolded “” with “all.avhrr”. This substitutes the original, uncalibrated AVHRR data for the reconstruction.

    To calculate the resulting verification statistics, find these lines in the “VERIFICATION” segment:

    ### Calculate the early/late calibration statistics for the satellite period
    ### verification [Steig et al. (2009)] for the 13-PC reconstruction

    early.stats=get.stats.full(calc.anom(, r.13.early[[1]], 1982, c(1994, 6), c(1994, 7), c(2006, 12))
    late.stats=get.stats.full(calc.anom(, r.13.late[[1]], c(1994, 7), c(2006,12), 1982, c(1994, 6))

    Again replace the “” with “all.avhrr”.

    After that, simply run the entire script. It will calculate the complete set of verification statistics, including the ground station verification statistics. You can confirm for yourself that the calibration did not affect the verification at all.

  54. Chris G said

    An underlying issue in both analyses is model order selection. Implicit in the order selection methods mentioned is that the “noise” is normally-distributed. Has the normality presumption been tested, e.g., by comparing a robust scale estimator such as median absolute deviation with r.m.s. residual or by looking at what happens to the skew and kurtosis of the residuals as the number of PCs increases?

    FYI, I do a lot of hyperspectral data analysis. My experience is that non-normal (heavy-tailed) noise generally results in an overestimation of the number of significant PCs if you use the eigenvalues of the sample covariance matrix as the basis for model order selection.
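    The diagnostic I have in mind can be sketched in a few lines of Python (my own illustration on synthetic data, not anything from the reconstruction scripts; I assume heavy-tailed t-distributed noise to make the effect visible). Heavy tails show up as a gap between a robust scale estimate (MAD) and the r.m.s. residual, and as excess kurtosis that persists as PCs are added:

```python
import numpy as np
from scipy import stats

# Synthetic field: rank-3 "signal" plus heavy-tailed t(3) noise.
rng = np.random.default_rng(0)
n_time, n_grid = 300, 50
signal = rng.standard_normal((n_time, 3)) @ rng.standard_normal((3, n_grid))
noise = stats.t.rvs(df=3, size=(n_time, n_grid), random_state=1)
X = signal + noise

U, s, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
for k in (1, 3, 5, 10):
    # residual after removing the first k PCs
    resid = (X - X.mean(0)) - U[:, :k] * s[:k] @ Vt[:k]
    r = resid.ravel()
    mad = stats.median_abs_deviation(r, scale="normal")  # robust sigma
    rms = np.sqrt(np.mean(r ** 2))
    # rms/mad stays well above 1 for heavy-tailed residuals
    print(k, round(rms / mad, 2), round(stats.kurtosis(r), 2))
```

    With Gaussian residuals the rms/MAD ratio would sit near 1 and the excess kurtosis near 0; a persistent gap suggests eigenvalue-based order selection will overcount significant PCs.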

  55. Ryan O said

    #54 The noise of the raw AVHRR data is best approximated by an AR(6) – AR(8) model with the first autocorrelation factor at ~0.3 or less, so broken stick or the bootstrapped eigenvalue methods will provide satisfactory results. The other thing to keep in mind is that the purpose of the PCA on the AVHRR data is merely data set reduction; it is not an attempt to isolate signals for analysis. As such, the only real danger is retaining too few PCs. We show this with the 21-PC reconstruction. I also ran reconstructions up to 30 PCs; the results do not change.
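    For reference, the broken-stick rule mentioned above can be sketched as follows (my own illustrative Python, not the reconstruction code): retain PC k only while its eigenvalue share exceeds the expected share of the k-th largest piece of a randomly broken stick.

```python
import numpy as np

def broken_stick_retention(eigvals):
    """Number of PCs to retain under the broken-stick rule.
    eigvals must be sorted in descending order."""
    p = len(eigvals)
    share = eigvals / eigvals.sum()
    # expected length of the k-th largest of p random stick pieces
    expected = np.array([sum(1.0 / j for j in range(k, p + 1)) / p
                         for k in range(1, p + 1)])
    keep = 0
    for obs, exp in zip(share, expected):
        if obs > exp:
            keep += 1
        else:
            break
    return keep

# e.g. one dominant eigenvalue among five -> retain 1
print(broken_stick_retention(np.array([10.0, 1.0, 1.0, 1.0, 1.0])))
```

    In practice the eigenvalues would come from the sample covariance of the AVHRR anomalies; since the PCA here is only data-set reduction, erring on the high side of this count is the safe direction.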

  56. I’ve been offline today – the fur’s been flying since I went out this morning. Wahl and Ammann purported to salvage MBH by adding in enough PCs until the results “converged”. Which meant adding in enough PCs to get the bristlecones in – even if they weren’t any good.

    It’s ironic that PC retention is a battleground issue in both MBH and Steig and they take opposite positions on the two studies.

    Another point to keep in mind is that Steig’s assertion that the Antarctica PCs are connected to physical causes is pretty clearly unfounded – a classic example of Buell’s “Castles in the clouds”, where Chladni patterns arising out of spatial autocorrelation are interpreted as having physical reality.

  57. Ryan O said

    #56 I had originally considered bringing that up in my initial reply, but felt that if I did so, it would increase the chances of it being snipped. It is, however, an issue that would be worthy of raising assuming we get published and depending on the response to the publication.

    Interestingly, Steig provided a link to RC’s document paraphrasing WA.

  58. Jeff Id said

    #56, I didn’t forget either. It’s not a small point and is a clear visual demonstration of the spatial limitations of 3 PC’s; the rest of the writing is just icing.

    Ryan has observed that Steig doesn’t seem to understand North’s separation argument – which is about something quite different than the sort of scree plot that he’s appealing to.

    An eigenvalue plot of spatially autocorrelated stations on a disk shaped like Antarctica has eigenvalues that decline very much like the Steig eigenvalues. That doesn’t mean that the first three Chladni modes are the only ones that are significant. Ironically PCs2 and 3 aren’t really “separable” in the North sense. They are a pair in a symmetric Chladni figure and they are pretty close in the somewhat irregular Antarctica. I don’t know what the uncertainties on the eigenvalues are, but I’d be amazed if there is a valid uncertainty interval that separates 2 and 3. Steig hasn’t cited any such calculation – he’s merely asserting it.
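    North’s rule of thumb can be sketched like this (illustrative Python, not from any of the reconstruction scripts; `n_eff`, the number of effectively independent samples, is itself an assumption that would need to be estimated from the data’s autocorrelation):

```python
import numpy as np

def north_separable(eigvals, n_eff):
    """North et al. (1982) rule of thumb: an eigenvalue's sampling
    error is roughly lambda * sqrt(2 / n_eff).  PC k is 'separable'
    from PC k+1 only if the gap exceeds that error bar."""
    errs = eigvals * np.sqrt(2.0 / n_eff)
    gaps = eigvals[:-1] - eigvals[1:]
    return gaps > errs[:-1]

# Two nearly degenerate leading eigenvalues (a Chladni-like pair)
# are not separable; the drop to the third is.
print(north_separable(np.array([10.0, 9.5, 2.0]), 50))
```

    On numbers like these, a PC2/PC3 pair that is close relative to its error bars cannot be individually interpreted, which is exactly the situation the Chladni-pair argument describes.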

    The real question is: why do PC analysis at all? I haven’t seen any argument that supports the use of PCs as opposed to doing area-weighted averaging.

    I was mildly curious as to Steig’s statement that he’s teaching PCs. Steig’s resume shows that his background is in geology and geochemistry rather than math or statistics – something that will surprise few readers after reading his RC post.

  60. herbert stencil said

    I just tried to post the following comment over at RC, only to find that the comments have been closed after only 42 comments. Strange.

    Anyhow, here is my comment.

    “If I may be so bold as to weigh into this fascinating discussion. What hasn’t been mentioned yet are two issues related to the Steig et al paper that probably led to the detailed investigations by Ryan O, JeffID, JeffC, Steve McIntyre and others.

    First, the press releases that accompanied the Steig et al paper were pretty specific. “Much Of Antarctica Is Warming More Than Previously Thought” at for example, which clearly sought to present a particular viewpoint.

    Second, the authors of Steig et al clearly thought that it was unnecessary to provide the detailed backup information such as data used, the methods used to process it, and code.

    In effect, the combination of those two decisions led to the interest by those mentioned above. As JeffID and RyanO have explained, their interest, at least in part, was to understand exactly what had been done, and the reasons why the press release ‘conclusions’ should be given currency.

    Because they were attempting to re-create (replicate if you like) the Steig et al work without the benefit of fully disclosed data/processing methods/codes, they had to start from scratch, and that led them down some blind alleys.

    I respectfully suggest that there is a lesson here for those concerned about AGW. The decision to word the press release as it was, allied to the decision to not release detailed data/methods/code has led to this topic gaining much more widespread discussion across the blogosphere than would have happened had different decisions been made.”

  61. Jeff Id said

    “I respectfully suggest that there is a lesson here for those concerned about AGW. The decision to word the press release as it was, allied to the decision to not release detailed data/methods/code has led to this topic gaining much more widespread discussion across the blogosphere than would have happened had different decisions been made.”

    This is certainly the case for me. Had Dr. Steig discussed the release of data and his code rather than ask me to take his matlab class, the entire tone of the thing would have gone differently – at least on my posts. I still would have found trends to be too high in the end but it would have taken about 1/10th the time.

    It’s too bad he closed the thread, I had a few more questions.

  62. Jeff Id said

    #59, I’m not sure why they use PC’s either. I still have several ideas for area-weighting surface stations based on sat covariance.

    What if the best covariance is used to pick the surface/AWS station for each pixel? It’s a good way to choose a proper influence area for the data and would result in good correlation/distance plots.

  63. Layman Lurker said

    Wow. I just got back from work an hour ago and I missed all the fun. Glad to see that Steig engaged you guys in minimal dialogue at least. Too bad he closed the thread.

    I just skimmed the RC thread but it did not seem to deal much with Ryan’s point about the “overfitting” straw man in a data decomposition situation. A tad ironic when much of the real problem is too few PC’s and a cluster of correlated, warming stations from a tiny area.

    There is considerable statistical experience in area weighting in the calculation of ore reserves – kriging comes from ore reserve calculations. I cannot see a relevant distinction between Steig trying to recon Antarctica with 3 PCs and someone trying to do an ore reserve using principal components. Kriging has a relatively long history and there are many specialists who understand its properties. It uses covariance but obviously in a different way than RegEM-TTLS after PC preprocessing (PC=3, regpar=3), a methodology that, to my knowledge, has never been previously used in that particular combination and whose properties are poorly known, to say the least.

  65. Fluffy Clouds (Tim L) said

    Ryan o

    Additionally, any time you splice two different measurement methods together, you must calibrate them. It is not an option. It is Steig’s burden of proof to show that splicing the 3 AVHRR PCs on the end of the ground data does not result in spurious trends. It is Steig’s burden of proof to show that the satellites and ground data are measuring identical things. It is Steig’s burden of proof to show that the satellite data can be used – as is – without any calibration to the ground data.

    NOW I GET IT! like trying to use Celsius and Fahrenheit

    I am a bit saddened that you two, Jeff + Ryan, did not ask: why use ’57 as a starting point? Not the ’40s or ’67?

    I did miss the fur flying but have to work one day a week! lol

    Thank you from everyone who has no voice on this, and if you need a bit of money to put this into publication I think we can all help.
    Don’t be shy to ask. 😉
    When you publish, use a date in the ’40s.
    And keep track of rejections; post them for all to see.

    can I kiss your feet now? 🙂 lol

  66. Re #65.
    The question about starting the analysis from 1957, was raised in the first RC article about Steig et al.
    Gavin’s answer was along the lines that that was the International Geophysical Year and when more of the weather stations were opened.
    It’s odd that they’ll let that fishermen rubbish ramble on for over 1,000 comments, but chop this short after a few dozen.

  67. Chris G said

    Follow-up to #54 and #55:

    1) My question in #54 was whether anyone has looked at what happens to the distribution of residuals (observation minus model output) as the number of PCs increases. PCA just accounts for variance in the data. It can be instructive to look at residuals to assess whether the variance is due disproportionately to a few outliers. One of the limitations of PCA is that it’s based on first- and second-order statistics. If there’s information in higher orders, it could significantly affect the interpretation of results obtained using PCA.

    2) A good “reasonableness” check on your model order selection method would be to formulate and evaluate a likelihood ratio test, i.e., define equations for the probability of observing the data (or residuals) given n PCs and (n+1) PCs and calculate the ratio of Prob(data|n) to Prob(data|n+1). It’s a good way of determining whether the reduction in residuals from adding the (n+1)th PC reflects a real “signal source”, as opposed to just the reduction in random noise you’d get by losing a degree of freedom.
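    A minimal sketch of that ratio test, assuming (for simplicity) i.i.d. Gaussian residuals, in Python; the function names are mine, and note that the chi-square reference distribution is only an approximation in the PCA order-selection setting:

```python
import numpy as np
from scipy import stats

def gaussian_loglik(resid):
    # log-likelihood of residuals under N(0, sigma^2) with sigma^2
    # set to its maximum-likelihood estimate
    n = resid.size
    sigma2 = np.mean(resid ** 2)
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)

def lrt_pvalue(resid_n, resid_n1, extra_params):
    # 2*(logL_{n+1} - logL_n) ~ chi2 with df = number of parameters
    # added by the (n+1)th PC
    lam = 2.0 * (gaussian_loglik(resid_n1) - gaussian_loglik(resid_n))
    return stats.chi2.sf(lam, df=extra_params)
```

    A small p-value says the extra PC reduced the residuals more than losing `extra_params` degrees of freedom to noise-fitting would explain; a p-value near 1 says the added PC bought nothing.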

  68. Jonathan Baxter said

    Ryan O, you get this question a lot I know, but are you planning to publish your results? It would be very worthwhile, not least of all because it would remove Steig’s excuse for not addressing the critical issues.

    [Aside: I don’t understood why Steig et al feel that science progresses faster through the glacial give-and-take of peer-review and counter-peer-review. Yes, you have to publish for the archival record and to ensure no glaring errors or omissions. And I agree there is no point in debating crackpots. But if someone such as yourself turns up willing to do the hard work and to discuss the issues in real-time, that’s gold!]

  69. Jonathan Baxter said

    oops: s/understood/understand/

  70. Kenneth Fritsch said

    I am thinking that the recent “discussion” at RC with Dr. Steig on the retention of PCs is mainly of value in determining what Steig’s arguments say about his understanding of the methods that he applied in Steig et al. (2009). I’ll be honest that in my judgment he attempted to assert and not show his case for overfitting. He also did not appear to understand the significance of Ryan O’s sensitivity analysis or that the analysis was a sensitivity test.

    I would hope that Ryan O would provide us at this thread with a review of the case made by Steig for overfitting and of Steig’s interpretation of what Ryan O did. I am certain that Ryan could accomplish this in a manner most objective and without invoking personalities.

    In exchanges like that with Steig, I would not expect a scientist of his acclaim to concede any errors or omissions (in fact that is not how it is handled in peer review where errors and omissions just fade away after a while with the publications of counter evidence) and therefore all we have is a better insight into his understanding of the issues at hand.

  71. Kenneth Fritsch said

    I am a bit sadden that you two, jeff + ryan did not ask why use ‘57 as a starting point? not ’40s or 67′ ?

    Since I am frequently one to bring these starting dates up, I think I can say that Steig could easily point to a 1957 start date as being the start of surface temperature measurements.

    My problem is that no direct references are made in the Steig paper concerning the trends and statistical significance of those trends being different than zero when coming forward from 1957 and the Steig ice core work that showed a warmest decade for West Antarctica in the 1935-1945 decade.

    In other words, a comment on the sensitivity of temperature trends for overall Antarctica and the three regions of Antarctica to start date was very much in order in my view.

  72. Kenneth Fritsch said

    I was mildly curious as to Steig’s statement that he’s teaching PCs. Steig’s resume shows that his background is in geology and geochemistry rather than math or statistics – something that will surprise few readers after reading his RC post.

    I was attempting to make sense of this comment also and came up with the proposition that Steig made an off-hand comment to his class (not a statistics course) about his defense of Steig et al. and his overfitting case for PC retention. He was then able to use that in the introduction to his RC thread, which was transparently in condescension mode.

    I wonder if any of his students asked for evidence for his assertions.

  73. Plimple said

    Jeff Id,

    You talk about the apparent shifting of models. To help the discussion, could you be more specific? i.e. which models/papers say one thing and which papers/models say another?

    I am aware of different modelling approaches that have been taken that yield some different results. For instance, in Steig’s RC post he cites a modelling paper by Goosse et al., 2009 whose methodology differs substantially from other more traditional GCM based approaches.

    As Steig points out, natural variability is believed to have played a significant role in climate trends over the Antarctic and to some extent stratospheric ozone depletion should have altered the radiative balance over the continent. Based on this, I would expect different model results from models that deal with natural variability and the issues of changes in stratospheric ozone in different ways.

    Also, I’m not sure that it’s fair to say that model goal posts have shifted in cases where model/observation discrepancies have been identified. In the case of tropical troposphere trends the model ensembles haven’t really shifted their prediction of what we’d expect from greenhouse gas induced TT warming.

    In summary, which models predict what? Which models compare well with or disagree with Steig’s analysis or Ryan O’s?

    Thank you for clarifying your position. I note that you condemned accusations of fraud on numerous occasions on WUWT and have criticised the potentially over the top pronouncements by Anthony.

  74. Shane Hayes said

    The Catholic Church has an excellent business process when it comes to creating new saints. The criteria to become a saint are clear: you have to a) be dead and b) have been responsible for two miracles. Naturally, when someone is in the running there is a large cheerleading cohort inclined to believe any report of a miracle. In order to guard against this, a senior cleric is appointed to find reasons not to make a saint out of the person.

    The title assigned to this senior cleric is: the Devil’s Advocate.

    What I have concluded from lurking on blogs such as this is that the internet has enabled and created a higher standard than peer review for verifying new science.

    That higher standard is epitomised by the work done by Ryan O, the two Jeff’s, Steve M and others on this case. This higher standard needs a name.

    How about Devil’s Advocate Review


  75. Jeff Id said

    #73, there are links in RC threads leading to head posts from early ’08 which say the Antarctic is cooling ‘in accordance with models’; the latest says it’s warming ‘in accordance with models’.

    There are too many different models to nail down which agree and which don’t. I know the ones presented in Steig et al are better at matching a warming trend than a cooling one. One was presented on yesterday’s Steig thread. The reality is that despite some of the wilder claims I’ve read, models are designed to match observation. If they don’t, they are rethought until they do.

    If the modelers are comfortable with the incorrect result of a warming antarctic trend in the last 40 years, the models showing the incorrect warming will not be adjusted. The extra ocean current won’t be discovered or the lower feedback won’t be realized.

    So since a simple analysis like this below shows 40 years of cooling and models show warming, AGW needs to change the models or – fix (repair) the data.

  76. Terry said

    Dhogaza: And, yes, when choosing 13 PCs overfitting is the first question that comes to mind.

    Oh my.

    Dhogaza:…so didn’t bother to read this latest work by you and RyanO.

    Probably just as well – I don’t think it would have made a lick of sense to you. You may want to start out with a good read about lossy compression, and work forward from there. Crawl, walk, run, read, comprehend, comment. Just a thought.

    Have a nice day!

  77. naught101 said

    Much respect to Ryan O. Your manner is beautiful, and a real breath of fresh air in the somewhat climate-blog-o-sphere.

  78. naught101 said

    somewhat stale*

  79. John Ritson said

    I missed the chance to ask Dr Steig the following question because he had closed comments before I got around to reading his post. So I’ll comment here just in case he reads this site and hope for the best.

    How does he reconcile his statement:
    “The basic lesson here is that one should avoid using more parameters than necessary when fitting a function (or functions) to a set of data. Doing otherwise, more often than not, leads to large extrapolation errors.

    In using PCA for statistical climate reconstruction, avoiding overfitting means — in particular — carefully determining the right number of PCs to retain.”

    against Dr Schmidt’s statement in “Dummies guide to the hockey stick controversy”:
    “How can you tell whether you have included enough PCs?
    This is rather easy to tell. If your answer depends on the number of PCs included, then
    you haven’t included enough.”

    One statement warns against using too many PCs because extraneous ones cause overfitting; the next says adding more PCs won’t affect your answer. Well, which is it?
