the Air Vent

Because the world needs another opinion

MW10 – Some thoughts

Posted by Jeff Id on August 18, 2010

Most of you were probably wondering if I would comment on this recent paper/book.  AOAS1001-014R2A0



In my opinion it is a landmark paper in its efforts to quantify the uncertainty in the proxies.  While this paper appears to be about paleoclimate reconstructions, the limitations of reconstructions it re-exposes so dramatically actually point directly to models. I don’t claim to have figured the whole thing out, and it isn’t without its flaws.  However, the work of these authors was more than extensive, with an excellent grasp of statistical prediction and of the quality of the raw data.  In my case, I’m very lucky to have already put in the groundwork with the Mann08 data; it made the paper very easy to read.  At the beginning of the paper the authors, in an almost blog-like fashion, took time to frame the impetus behind the work.

This effort to reconstruct our planet’s climate history has become linked
to the topic of Anthropogenic Global Warming (AGW). On the one hand,
this is peculiar since paleoclimatological reconstructions can provide evidence
only for the detection of AGW and even then they constitute only
one such source of evidence. The principal sources of evidence for the detection
of global warming and in particular the attribution of it to anthropogenic
factors come from basic science as well as General Circulation
Models (GCMs) that have been fit to data accumulated during the instrumental
period (IPCC, 2007). These models show that carbon dioxide, when
released into the atmosphere in sufficient concentration, can force temperature

On the other hand, the effort of world governments to pass legislation to
cut carbon to pre-industrial levels cannot proceed without the consent of
the governed and historical reconstructions from paleoclimatological models
have indeed proven persuasive and effective at winning the hearts and
minds of the populace. Consider Figure 1 which was featured prominently
in the Intergovernmental Panel on Climate Change report (IPCC, 2001) in
the summary for policy makers. The sharp upward slope of the graph in
the late 20th century is visually striking, easy to comprehend, and likely to
The IPCC report goes even further:

Uncertainties increase in more distant times and are always much larger than
in the instrumental record due to the use of relatively sparse proxy data. Nevertheless
the rate and duration of warming of the 20th century has been much
greater than in any of the previous nine centuries. Similarly, it is likely that
the 1990s have been the warmest decade and 1998 the warmest year of the
[Emphasis added]

It’s so true.  Mann wouldn’t have become famous if the hockey stick had no meaning (as I’m sure he’s quietly wishing), or if the result weren’t so shocking in appearance.  If you’re new to the discussion: when hockey sticks have been discredited, the argument by climate science™ usually shifts to – it didn’t matter anyway because of all the other evidence.  In reality, they do matter.  They matter for model hindcasts, which are the entire basis for the future projections.

The paper concludes in part:

On the one hand, we conclude unequivocally that the evidence for a
“long-handled” hockey stick (where the shaft of the hockey stick extends
to the year 1000 AD) is lacking in the data. The fundamental problem is
that there is a limited amount of proxy data which dates back to 1000 AD;
what is available is weakly predictive of global annual temperature. Our
backcasting methods, which track quite closely the methods applied most
recently in Mann (2008) to the same data, are unable to catch the sharp run
up in temperatures recorded in the 1990s, even in-sample. As can be seen
in Figure 15, our estimate of the run up in temperature in the 1990s has
a much smaller slope than the actual temperature series. Furthermore, the
lower frame of Figure 18 clearly reveals that the proxy model is not at all
able to track the high gradient segment. Consequently, the long flat handle
of the hockey stick is best understood to be a feature of regression and less
a reflection of our knowledge of the truth.
Nevertheless, the temperatures
of the last few decades have been relatively warm compared to many of the
thousand year temperature curves sampled from the posterior distribution
of our model.

It’s about as damning a description of the paleo branch of climate science™ as you could ask for.  There are numerous little daggers hidden in the text too.

As an initial test, we compare the holdout RMSE using the proxies to two
simple models which only make use of temperature data, the in-sample
mean and ARMA models. First, the proxy model and the in-sample mean
seem to perform fairly similarly, with the proxy-based model beating the
sample mean on only 57% of holdout blocks. A possible reason the sample
mean performs comparably well is that the instrumental temperature
record has a great deal of annual variation which is apparently uncaptured
by the proxy record.
In such settings, a biased low variance predictor (such
as the in-sample mean) can often have a lower out-of-sample RMSE than a
less biased but more variable predictor. Finally, we observe that the performance
on different validation blocks
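The bias/variance point in that passage is easy to demonstrate with a toy simulation (all numbers are invented and have nothing to do with the actual proxy data): a biased, zero-variance predictor like the in-sample mean can beat an unbiased but noisy per-year predictor on holdout RMSE.

```python
import numpy as np

rng = np.random.default_rng(42)
n_train, n_test = 100, 50

# "Instrumental temperature": a fixed mean plus large annual variation
temp = rng.normal(0.0, 1.0, n_train + n_test)
train, test = temp[:n_train], temp[n_train:]

# Predictor 1: the in-sample mean (biased toward the training mean,
# but with essentially zero variance)
mean_pred = np.full(n_test, train.mean())

# Predictor 2: a noisy per-year estimate, standing in for a weak
# proxy model (each prediction = truth + its own large error)
noisy_pred = test + rng.normal(0.0, 1.5, n_test)

def rmse(pred):
    return np.sqrt(np.mean((pred - test) ** 2))

print(rmse(mean_pred), rmse(noisy_pred))
# The constant wins despite knowing nothing about the individual years.
```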

Considering that the majority of the Mann08 proxy record is trees, it’s an interesting point that warmer years aren’t individually captured in the tree rings.  What exactly prevents trees from reacting annually to temperature is another mystery.  By “interesting,” I mean it’s a point that drives me crazy.  Haha.

Correlation is not a physical process, and unfortunately this paper suffers a bit from the combination of stats and reality.  No, this isn’t the time for ‘correlation isn’t causation’, but it is a time to consider the non-natural comparison of datasets that correlation represents.  It’s rather entertaining to see people write with excitement that two unrelated positive trends have a correlation greater than 0.1 – for instance, global economic output and the ever-improving blogging experience at the Air Vent.   Why it’s surprising is where the topic leaves me a bit dumbfounded, and concerned that perhaps the authors didn’t realize the extent of the infilling – hockeystickization – of the Mann08 proxy data used in this paper.

What I mean is, when one regression setting is confirmed by the good correlation of unused data against the used data, something improved seems to be going on.  However, when the data has a general upslope – substantially pasted on by RegEM – who should be surprised by a bit of correlation?   What’s more, the regression performed seems to me to be another hockey-stick-seeking missile.  The good news is that their hockey stick was verified – or in this case, shown to be unverified – using actual statistics.

The only thing which really bothered me was the complete ignoring of the variance loss created by their methods, as well as others’.  They apparently miss the point that methods for extracting a signal from incredibly noisy data, based on a shortened calibration period, will preferentially select autocorrelated noise.  For the math-enhanced reader, autocorrelated is defined here as meaning anything with a temporally persistent signal – or for those of us who don’t get lost in math nuance, “noise with a trend”.

They really ignored it too:

Alternatively, the number
of proxies can be lowered through a threshold screening process (Mann
et al., 2008) whereby each proxy sequence is correlated with its closest local
temperature series and only those proxies whose correlation exceeds a
given threshold are retained for model building. This is a reasonable approach,
but, for it to offer serious protection from overfitting the temperature
sequence, it is necessary to detect “spurious correlations”.

As far as the ‘reasonable approach’ goes, all I can say is pure bovine scatology and hand waiving; while it is friendly hand waiving to those who are otherwise critiqued, it is still hand waving.  It’s like all these guys went to the same school of chuck it if you don’t like it!  I’ll take my lousy state school education over this kind of thing any day.  All because of the concept that you might be able to detect ‘spurious’ correlations.  My god, that misses the point that correlations are a mathematical artifact, not a reality detector.

– Sorting a lot of long time series by correlation against a short calibration series causes guaranteed variance loss in the rest.

– Scaling long time series by a least squares fit against a short calibration series causes guaranteed variance loss in the rest.

– Regressing long time series against a short calibration series causes guaranteed variance loss in the rest.
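For anyone who wants to see the effect rather than take my word for it, here’s a toy sketch of the screening problem: pure red noise with no temperature signal anywhere, yet screening by calibration-window correlation manufactures a rising blade and a flat, low-variance handle (every parameter below is invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
n_series, length, cal = 1000, 1000, 100   # last 100 "years" = calibration

# Pure AR(1) red noise: temporally persistent, but containing no signal
series = np.zeros((n_series, length))
for t in range(1, length):
    series[:, t] = 0.9 * series[:, t - 1] + rng.normal(0.0, 1.0, n_series)

target = np.linspace(0.0, 1.0, cal)       # a rising "instrumental" target

# Screening step: keep only series that correlate with the target
# inside the short calibration window
r = np.array([np.corrcoef(s[-cal:], target)[0, 1] for s in series])
kept = series[r > 0.3]

recon = kept.mean(axis=0)
blade_slope = np.polyfit(np.arange(cal), recon[-cal:], 1)[0]
print(len(kept), blade_slope, recon[:-cal].std())
# Selection manufactures a rising blade from pure noise, while averaging
# the unscreened early portion collapses its variance into a flat handle.
```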

It’s all the same thing, and it’s still not understood.  So, why all the noise, Jeff?  After all, didn’t the paper completely prove that the proxy reconstructions aren’t doing their job?

Well yeah, but look at this plot:

Now I cannot explain the continuous rise of the pre-calibration handle of the hockey stick, but these authors have done absolutely nothing I can see to address why the blade/handle relationship is guaranteed by the math.  I don’t yet claim a high-quality understanding of the lasso method, but the variance loss which will happen goes undiscussed.

Anyway, my impression of the paper is that it has a lot of appropriately critical wording inside, but it also suffers from a bad reconstruction which was not appropriately criticized within the paper.  They had to start somewhere, but it seems to me that while citing von Storch and Zorita 04, they missed the crux of the argument.  Maybe I’m wrong.

A key issue, which was a positive in the paper, was the quality of the signal in the proxies:

Hence, the real proxies–if they contain linear signal on temperatures–should outperform our pseudo-proxies, at least with high probability.

Which is later demonstrated in Table 1 not to be the case.

In general, the pseudo-proxies are selected about as often as the true proxies. That is, the Lasso does not find that the true proxies have substantially more signal than the pseudoproxies.

Which is a conclusion that I’ve come to here over the past two years: there really isn’t much signal in these proxies.  Not enough to separate fake proxies – with similar autocorrelation properties and no signal – from temperature proxies.

I think from my recent work here on Mann07 though, we might be able to make an engineering style estimate of the true signal contained in the Mann08 proxies.  That is something I haven’t read in the literature (not that it isn’t there) but since we have climate model data with the ability to add similar autocorrelation to that found in proxies, we can make an estimate.  This estimate will likely be the subject of my next post.

Final thoughts:

Good paper, good conclusions, covering what is still an ugly reconstruction method.  Like the MMH10 model paper, this will need to be addressed by the community rather than ignored.  These statistically correct critiques of paleoclimate are becoming more common and will continue until the issue is properly addressed by the consensus community.  If the proxies don’t contain enough signal to be better predictors of temperature than sophisticated noise, models which use reconstructions to verify the accuracy of hindcasts have got nothing to be verified against!

Or as Briggs quotes from MW10:

Climate scientists have greatly underestimated the uncertainty of proxy-based reconstructions and hence have been overconfident in their models.

32 Responses to “MW10 – Some thoughts”

  1. AMac said

    Jeff, you observe AGW Consensus advocates and scientists rebutting problems with hockey-stick paleoreconstructions by saying “it doesn’t matter.”

    > “‘It didn’t matter’ anyway because of all the other evidence. In reality, they do matter. They matter for model hindcasts which are the entire basis for the future projections.”

    This strikes me as about right, from my worm’s eye view of l’affaire Tiljander.

    One interesting tidbit I (belatedly) discovered this week is that Mann08’s authors performed a log-transform of all four of the Tiljander data series before using them. “Not that there’s anything wrong with that” … but why?

    I don’t recall any discussion of the point. To my knowledge, there is no physical basis for believing that the temp-varve relationships are exponential in nature. I don’t think anyone has claimed a log-normal distribution for any of the data sets.

    It may be that Mann08’s authors are so used to looking at data sets as strings of numbers, that they don’t much think of them as actual measurements of particular physical quantities. Since a proxy like Lightsum or Darksum cannot have a linear relationship to temperature–look at the graphs–why not log transform them as a first step?

    If you look at the Tiljander proxies as millimeters of sediment, you see right away that

    Darksum = Thickness – Lightsum

    That’s because Thickness is directly measured (e.g. with a caliper), and Lightsum is a measured quantity, while Darksum is imputed by the above equation. The graph is here.

    Log transform, and the relationship becomes non-obvious.
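    A quick numeric check of that point (the numbers below are invented, not Tiljander’s): the linear identity is exact, but it vanishes once everything is log-transformed.

```python
import numpy as np

# Hypothetical varve measurements in millimeters, purely illustrative
lightsum = np.array([0.4, 0.7, 1.1, 0.9])
darksum = np.array([0.6, 0.5, 0.8, 1.2])
thickness = lightsum + darksum            # Darksum = Thickness - Lightsum

# The linear identity holds exactly...
print(np.allclose(thickness - lightsum, darksum))

# ...but disappears after a log transform:
print(np.allclose(np.log(thickness) - np.log(lightsum), np.log(darksum)))
```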

    Mann08 screened, validated, and calibrated Thickness, Lightsum, and Darksum as three independent proxies for temperature, with the same orientation. Think about that!

    It’s a small point, but possibly a revealing one as to attitude. How much time did any of the authors take to chart the Tiljander data and get a sense of its characteristics? Not very much, I suspect. They seem to be in love with the idea of discovering candidate data series and tossing them into the proxyhopper, in the faith that MatLab will turn straw into gold.

    Sorry, this is off-topic with respect to the narrow issue of MW10, as they specifically exclude questions of proxy variability. I bring it up because I think it may relate more broadly to the kind of attitude that you discuss in the post.

  2. Jeff Id said

#1 I read your blog post and forgot to go back and comment. I think the log originated from the original Tiljander paper. It’s probably a standard swag in sediment proxy science – but I’m pretty certain it is just a swag.

  3. steveta_uk said

    it’s an interesting point that warmer years aren’t individually captured in the tree rings.

    Wood is quite a good insulator – perhaps it takes several years for the inner regions of the tree to notice the temperature change.

Perhaps it will equally take several years for all this to sink in with climate science™

  4. BobN said

AMac and Jeff – It is my experience looking at lots of environmental data (e.g., groundwater contamination, natural distribution of elements in the environment, river flow data) that many such data are better described as log-normal than normal distributions. So it may be the case with varves.

  5. Kenneth Fritsch said

I think we are seeing the difficulty of criticizing the many distinct problems with the Mann methodologies and selection processes. When there are so many different sources of problems, it is difficult to account for all of them in a single blog thread, and even more difficult to cover every point in a published paper. It then appears that the author(s) are missing some important points.

Certainly MW 2010 points to the selection and quality of the data as potentially problematic for the reconstructions. They set that issue aside openly and, since they are publishing in a statistics journal, are more interested in the statistics than in the prior selection criteria, which are more climate-science oriented.

    They quote from the von Storch and Zorita papers and thus must be aware of the variance problems with reconstructions, and they present alternative models which apparently do not show much variation, so it is a bit puzzling why they did not at least comment on it. Perhaps the paper would have been editorially considered too long.

    I think the major point that MW show in their paper is the failure of models to get the run-up in temperatures to the 1990s right. I am not a statistician, but I do not think that the Lasso method would somehow miss that run-up. If nothing else the MW models provide a sensitivity test for the reconstructions, the end result being that those reconstructions are not “robust”.

    The MW paper should sensitize or re-sensitize us to “hide the decline” and the team practice of tacking the instrumental record onto the end of the reconstruction. Take another look at the reconstructions for the later part of the instrumental period as evidenced in the Mann 08 paper and SI

    Click to access MannetalPNAS08.pdf

  6. Doug Proctor said

    BobN suggested “varves” may be better described by log-normal than normal distributions from his work.

    Varve thickness is controlled by two variables:
    1. effective runoff time length, and
    2. sediment load.

    The effective runoff time, i.e. the length of time sediment-carrying waters entered the catchment basin, is itself a function of temperature during the melting or rainfall period of the year, AND of the rate of discharge. Below a locally critical rate, virtually no sediment will enter the basin even though the streams are running and it is warm. The sediment load is controlled by source area and discharge rate, both affected by temperature, plant growth and precipitation. Bare, cold, dry areas of dirt will make periodic muddy streams when warm; plant-covered areas will lead to clear streams. Or warm and dry areas can give periodic muddy streams, and so on.

    The variables show that there is no unique solution to varve analysis. At the same time, each of those variables is definitely not linear. Temperature and precipitation in the watershed are clearly cyclical, but there are step-functions for both. Strong events are episodic but neither random nor predictable. We are dealing with weather, not climate, in the study of varves (similarly, tidal cycles have strong weather signatures on top of the lunar cycles).

    You see periodicity in varve changes, which over time is climate. Linearity is not to be expected, but if it occurred would be a nice indication that only one variable was being changed. It would then be up to other lines of thinking to figure out which one it was.

    Engineers find geology crazy-making. All data is soft, and few problems have single solutions.

  7. Don Keiller said

    Steveta_UK. One of the interesting things about tree rings is that each ring is produced by the annual growth response, particularly marked in temperate climates where there is marked seasonality.
    Growth is the integral sum of the various physiological responses of the tree to its local microenvironment.
    Key factors influencing the growth response include:
    water availability,
    light availability,
    macro- and micro-nutrient availability,
    pest and disease activity,
    atmospheric CO2 concentration,
    cosmic rays.

    A “good” growth year and hence a wide annual growth ring will occur when none of these factors limits.
    If one limits then growth and ring size diminish.
    There is no way of telling which factor limited unless you were there making the necessary measurements at the time the growth ring was laid down.
    An added complication is that growth responses are non-linear. Temperature, in particular shows a distinct optimum.
    Growth slows above and below that optimum. Thus unusually cold and hot years will both cause ring width to decrease.
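    A toy version of that non-linearity (the optimum and curvature below are invented): with a quadratic growth response, a cold year and a hot year produce identical ring widths, so the ring alone cannot tell you which one occurred.

```python
# Toy quadratic growth response with a temperature optimum (values invented)
def ring_width(temp_c, optimum=15.0, width_max=2.0, curvature=0.02):
    return width_max - curvature * (temp_c - optimum) ** 2

# A year 5 C below the optimum and a year 5 C above it: identical rings
cold, hot = ring_width(10.0), ring_width(20.0)
print(cold, hot)
```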

    Hence any environmental signal from such rings is very noisy – too noisy IMHO to derive a temperature signal from.

    Much better are tree lines (altitudinal and latitudinal), which are temperature limited. Trees need a certain number of “degree days” (days with temperatures above a certain minimum) to become established. If you have a few decades of warmer weather, tree lines move upwards; when the climate cools, the trees die and become a sub-fossil treeline above (in terms of altitude/latitude) the existing treeline.
    Tree rings do have a valuable role here: they can be used to very accurately date when these treelines were established and when they died.

    Unfortunately they do not provide the “right” answer, so papers about them tend to be ignored by the climate science community and the media.



  8. AMac said

    Re: BobN (Aug 18 11:31),

    I take your point, it’s a fair one and I stand corrected (trying to learn this stuff). Also thanks Doug Proctor for the varve primer.

    What strikes me as clear-cut is that Darksum, Lightsum, and Thickness cannot all contain independent information on paleotemperature. In particular, consider the case where, following Mann08, you treat all three as though higher values correlate to higher temperature.

    Let’s start by stipulating Lightsum has some such signal. So let’s add it to the proxyhopper.

    (1) Darksum might have additional signal, so let’s use that, too. But then Thickness does not: it’s simply Lightsum plus Darksum.

    (2) Darksum might not have additional signal. In that case, let’s not use it. But then we shouldn’t use Thickness: it is still Lightsum plus Darksum.

    For case (1) and for case (2) — Lightsum, Darksum, and Thickness are not three proxies with independent signal to contribute to the reconstruction.

    My perspective is that Mann08’s authors could have figured this out, with some diligence. But they didn’t.

    [Kenneth Fritsch points out that there are many kinds of problems, and threads get unfocused when everything gets thrown into one… I’ll let the discussion get back to MW10.]

  9. Layman Lurker said

    I think from my recent work here on Mann07 though, we might be able to make an engineering style estimate of the true signal contained in the Mann08 proxies. That is something I haven’t read in the literature (not that it isn’t there) but since we have climate model data with the ability to add similar autocorrelation to that found in proxies, we can make an estimate. This estimate will likely be the subject of my next post.

    If you calibrated pure red noise, wouldn’t rejected proxies be negatively correlated to instrumental? Scatterplots of rejected pure red noise proxies plotted with appropriate instrumental would be interesting to see. This bias in the rejected proxies should contain information pertinent to your question, no?

    Using synthetic data where all of the S/N, correlation, and red noise properties are known, it should be possible to predict what the

  10. Layman Lurker said

    Sorry for buggering up my post in #9.

  11. Brian H said

    In targeting some of the weaknesses of the Mann-schtick, the authors definitely missed some others. Since several ways exist to show lack of validity, this may not seem important — one fatal flaw should be enough! But OTOH it’s best not to seem to issue passes to any of the invalid procedures.

    BTW — make up your mind, Jeff! “hand waiving, while it is friendly hand waiving to those who are otherwise critiqued, it is still hand waving” . Are you talking about doing without hands (“Look, Ma, No hands!”) or flapping them (“Hi, Ma, Here I am!”) 😉

  12. Brian H said

    Steveta — just to make the reply in #7 even clearer: insulation and penetration of heat into the trunk has absolutely nothing to do with growth rates. Those are determined by nutrition and energy intake through leaves, etc.

  13. Carrick said

    It does look to me like the criticism that MW10 calibrated against hemispheric temperature rather than regional scale temperature is well-founded.

    (If you have enough proxies it doesn’t matter, but with just a few proxies, you will get a net noise amplification.)

    On another issue, Jeff, do you know of any reconstructions which assumes something other than strict proportionality between temperature and the proxy response?

    E.g., at the least I’d use a model like:

    P(n) = a P(n-1) + b T(n) + c + noise

    where P(n) is the proxy at time t_n, T(n) is regional-scale weather (600-km radius mean temperature) and a,b,c are coefficients to be fit for.
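    As a sanity check that such a lag-one model is fittable, here’s a sketch that simulates Carrick’s form and recovers the coefficients by ordinary least squares (all coefficients and noise levels are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
a_true, b_true, c_true = 0.6, 0.8, 0.1   # invented coefficients

# Regional temperature stand-in, and a proxy following the lag-one model
# P(n) = a*P(n-1) + b*T(n) + c + noise
T = rng.normal(0.0, 1.0, n)
P = np.zeros(n)
for t in range(1, n):
    P[t] = a_true * P[t - 1] + b_true * T[t] + c_true + rng.normal(0.0, 0.3)

# Recover a, b, c by ordinary least squares on (P(n-1), T(n), 1)
X = np.column_stack([P[:-1], T[1:], np.ones(n - 1)])
coef, *_ = np.linalg.lstsq(X, P[1:], rcond=None)
print(coef)   # should land near (0.6, 0.8, 0.1)
```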



  14. stan said

    I think if someone painstakingly cataloged every problem with the hockey stick, the book created thereby would have multiple volumes. There are a lot of aspects to the global warming craziness that are real gobsmackers. The almost universal and immediate acceptance of the hockey stick by the broad scientific community despite the obviousness of some of these problems ranks right up there with the worst of these gobsmackers.

  15. Jeff Id said

    #13 Nope, I’ve only seen the direct linear version. I’m not the best read on the literature though because paywalls often stop me from having too much fun.

  16. Carrick said

    Jeff, if you have any articles you need help acquiring let me know. (You know my email of course.) I can always use the expanded bibliography myself, so there is a self motive too.

    What do you think of Eli’s argument that I linked to above?

    I believe this can be directly modeled. As you know, one of the complaints I’ve personally had with climate science is the excessive reliance on hand waving, something Eli is particularly good at. I presume he’s competent enough at residual-type analyses to be able to check the magnitude of the effect….

  17. Jeff Id said

    #16, I don’t have time right now, but I left a comment at eli’s. Thank you though for the offer.

  18. Carrick said

    Most of Eli’s regular audience aren’t good at much besides spitting in people’s eyes, so I thought I’d just post this here…

    There is a problem I’ve thought of with comparing against local temperature rather than the global one.

    The short version of this is to consider what the proxy is really responding to. Usually it is other local parameters besides just temperature. E.g.,

    Equation 1:

    Proxy(t) = a * Proxy(t-tau) + b * LocalTemperature(t) + c * LocalPrecipitation(t) + d * LocalSolarExposure(t) + … + noise

    (By local, I define the mean value over one correlation length… e.g., a tapered average with a 600-km radius would do.)

    The coefficients a, b, c, d… are assumed invariant and have only to do with how the proxy relates to variation in temperature, solar exposure, etc. The coefficient “a” is the simplest form of expressing “the current history depends on its past history”. Clearly tree ring growth this year depends on prior years’ growth, for example.

    So here’s the problem: You are assuming this simplified relationship exists:

    Equation 2:

    Proxy(t) = b * LocalTemperature(t) + noise

    but LocalTemperature(t) is often correlated strongly with LocalPrecipitation(t) and LocalSolarExposure(t). (What we are calling noise is often associated with LocalTemperature(t) also, but that’s another problem.)

    What goes wrong is that the correlated part of LocalPrecipitation(t), for example, creeps into your effective sensitivity coefficient for temperature, and the uncorrelated part creeps into the noise. E.g.,

    LocalPrecipitation(t) = C_p * LocalTemperature(t) + N_p

    and you are left with

    Equation 3:

    Proxy(t) = (b + C_p*c) * LocalTemperature(t) + (noise + N_p * c)

    And worse… these errors are strongly correlated over exactly the time scales you’re interested in. It could very well be that in trying to correlate against local temperature (only) that you end up with a worse estimator of the true temperature sensitivity of the proxy than you would had you only used global temperature.
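    A small simulation confirms the algebra (all sensitivities below are invented): regressing the proxy on temperature alone recovers b + C_p*c rather than b.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
b, c, C_p = 0.5, 0.3, 0.8     # invented sensitivities

temp = rng.normal(0.0, 1.0, n)
precip = C_p * temp + rng.normal(0.0, 1.0, n)   # precipitation tracks temp
proxy = b * temp + c * precip + rng.normal(0.0, 0.5, n)

# Regressing the proxy on temperature ALONE recovers b + C_p*c, not b
slope = np.polyfit(temp, proxy, 1)[0]
print(slope)   # near 0.5 + 0.8*0.3 = 0.74
```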

    Beyond that, I’m behind the eight ball myself. Some of the work I’m doing this summer involves measuring the sound generated by this:

    UTTR blast

    The initial fireball is about 1000-feet across if you want a scale (the big ones are 45,000 lbs of TNT equivalent blast). I’ll be out in that area for about three weeks.

  19. dougie said

    #7 – Hi Don, re – your comment

    ‘Unfortunately they do not provide the “right” answer so papers about them tend to be ignored by the climate science community and the media.’

    But how do they get away with ignoring these papers? This seems to my mind, as a layman, the most obvious and logical way to use tree/treeline data to map past climate changes.

  20. Jeff Id said


    That looks like an awesome job.

    I think you’ve defined the issue just right. The rabbit wants to call it a flaw, whereas it’s just a different yet similar method which may even have advantages. Although, oversampling of small spatial regions becomes a problem. Perhaps an aggregate signal from the locales of the actual proxies would make more sense.

    Whatever, I still think the method stinks. haha.

  21. Kenneth Fritsch said

    And worse… these errors are strongly correlated over exactly the time scales you’re interested in. It could very well be that in trying to correlate against local temperature (only) that you end up with a worse estimator of the true temperature sensitivity of the proxy than you would had you only used global temperature.

    Carrick, this is a point that is made in MW 2010.

  22. Carrick said


    Carrick, this is a point that is made in MW 2010.

    Cool. I haven’t read the paper yet… if you get a chance, could you point to the relevant section of the paper?

    Jeff: I have a suspicion that if you use a model that includes autocorrelation in the proxy signal you’ll get a better correspondence to temperature (especially if you can somehow manage to reconstruct not just temperature, but temperature, precipitation, solar exposure, etc.).

    It’s actually fairly easy to see that for a proxy P depending just on temperature T for example,

    P(t) = a P(t-tau) + b T(t)

    is just a way of modeling a frequency dependence in T:

    Phat(f) = a * Phat(f) * exp(-i * omega * tau) + b * That(f)

    Phat(f) = b / (1 - a * exp(-i * omega * tau)) * That(f)

    (Phat and That are the Fourier transforms of P and T, of course)

    I’ve mumbled about determining the complex transfer function between That and Phat… this is the sort of thing that I am considering.

    What it effectively does is change the phase of small frequency amplitudes compared to large frequency ones. If you assume a strict (real-valued) constant to relate the proxy and the actual temperature record, what will happen is that the frequencies for which That is maximum will dominate your “calibration”, and when you try to reconstruct multi-decadal records, you’ll find that other frequency components will be added out of phase…

    resulting in the observed loss of covariance.

    Anyway, that’s my theory of what may be going wrong with the reconstructions and how one might fix it.
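    That frequency dependence is easy to visualize numerically. With invented values for a, b, and tau, the gain |b / (1 - a*exp(-i*omega*tau))| falls monotonically with frequency, so the proxy behaves as a low-pass filter:

```python
import numpy as np

a, b, tau = 0.7, 1.0, 1.0               # illustrative coefficients only
freqs = np.linspace(0.01, 0.5, 50)      # cycles per year, up to Nyquist
omega = 2.0 * np.pi * freqs

# Transfer function for P(t) = a*P(t-tau) + b*T(t):
# Phat(f) = b / (1 - a*exp(-i*omega*tau)) * That(f)
H = b / (1.0 - a * np.exp(-1j * omega * tau))
gain = np.abs(H)

print(gain[0], gain[-1])   # gain falls with frequency: low-pass behavior
```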

  23. boballab said


    Carrick: MW 2010 start their Model Evaluation in section 3.1 (page 8 of the PDF), working with CRU Northern Hemisphere, and work their way through. Then in section 3.6 (on page 23 of the PDF) they rerun the same steps used for CRU Northern Hemisphere but using “local” temperatures.

    The other part dealing with the paper is that anyone trying to go after MW 2010 based on how good the proxies are is actually attacking Mann 08. They expressly state they are working from the assumption that Mann and other Climate Scientists did a correct job and picked proxies that respond to temperature and not to something else:

    We are not interested at this stage in engaging the issues of data quality. To wit, henceforth and for the remainder of the paper, we work entirely with the data from Mann et al. (2008).

    We assume that the data selection, collection, and processing performed by climate scientists meets the standards of their discipline. Without taking a position on these data quality issues, we thus take the dataset as given. We further make the assumptions of linearity and stationarity of the relationship between temperature and proxies, an assumption employed throughout the climate science literature (NRC, 2006) noting that ”the stationarity of the relationship does not require stationarity of the series themselves” (NRC, 2006). Even with these substantial assumptions, the paleoclimatological reconstructive endeavor is a very difficult one and we focus on the substantive modeling problems encountered in this setting.

    You really need to read it for yourself to get the flavor of it.

  24. Don Keiller said

    Dougie, don’t take my word for it, just look at what gets into the IPCC reports.
    Tree rings, yes, particularly in the third assessment; treelines are rather less obvious.

  25. Kenneth Fritsch said

    Carrick @ Post #22:

    MW 2010 make lots of rather nuanced points in their paper and one can only appreciate the points being made by reading the entire paper. Comments out of context will not mean much in reviewing this paper, but I think we will see lots of that.

    Page 8 MW 2010:

    A critical difficulty for paleoclimatological reconstruction
    is that the temperature signal in the proxy record is surprisingly weak.
    That is, very few, if any, of the individual natural proxies, at least those
    that are uncontaminated by the documentary record, are able to explain
    an appreciable amount of the annual variation in the local instrumental
    temperature records.

    Page 9 MW 2010:

    The problem of spurious correlation arises when one takes the correlation
    of two series which are themselves highly autocorrelated and is well studied
    in the time series and econometrics literature (Yule, 1926; Granger
    and Newbold, 1974; Phillips, 1986). When two independent time series
    are non-stationary (e.g., random walk), locally non-stationary (e.g., regime
    switching), or strongly autocorrelated, then the distribution of the empirical
    correlation coefficient is surprisingly variable and is frequently large in
    absolute value (see Figure 4). Furthermore, standard model statistics (e.g.,
    t-statistics) are inaccurate and can only be corrected when the underlying
    stochastic processes are both known and modeled (and this can only be
    done for special cases).
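
    The spurious-correlation effect described in that passage is easy to reproduce. A minimal sketch — the series length of 149 is chosen to echo the instrumental record; everything else is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
n_series, n_steps = 1000, 149

# Pairs of INDEPENDENT random walks (cumulative sums of white noise).
x = np.cumsum(rng.standard_normal((n_series, n_steps)), axis=1)
y = np.cumsum(rng.standard_normal((n_series, n_steps)), axis=1)

# Empirical correlation of each independent pair.
xc = x - x.mean(axis=1, keepdims=True)
yc = y - y.mean(axis=1, keepdims=True)
r = (xc * yc).sum(axis=1) / np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))

# For i.i.d. data, |r| > 0.5 at n = 149 would be essentially impossible
# by chance; for independent random walks it is routine.
print(np.mean(np.abs(r) > 0.5))   # a sizeable fraction
```

    For i.i.d. pairs the standard deviation of r would be about 1/sqrt(n) ≈ 0.08, so the fat-tailed distribution you see here is exactly the “surprisingly variable” empirical correlation the quote refers to.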

    As can be seen in Figures 5 and 6, both the instrumental temperature
    record as well as many of the proxy sequences are not appropriately modeled
    by low order stationary autoregressive processes. The dependence
    structure in the data is clearly complex and quite evident from the graphs.

    Page 10 MW 2010:

    More quantitatively, we observe that the sample first order autocorrelation
    of the CRU Northern Hemisphere annual mean land temperature series is
    nearly .6 (with significant partial autocorrelations out to lag four). Among
    the proxy sequences, a full one-third have empirical lag one autocorrelations
    of at least .5 (see Figure 7). Thus, standard correlation coefficient
    test statistics are not reliable measures of significance for screening proxies
    against local or global temperatures series. A final more subtle and salient
    concern is that, if the screening process involves the entire instrumental
    temperature record, it corrupts the model validation process: no subsequence
    of the temperature series can be truly considered out-of-sample.
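
    The screening problem in that passage can be checked directly with the lag-one autocorrelations the paper quotes (about .6 for the CRU series, .5 for many proxies). The simulation below is my own sketch, not the paper’s calculation: it generates independent AR(1) pairs with those coefficients and counts how often the naive i.i.d. significance test flags a “significant” correlation.

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 149, 5000
phi_temp, phi_proxy = 0.6, 0.5    # lag-one autocorrelations quoted by MW 2010

def ar1(phi, size, rng):
    """Generate an AR(1) series x[i] = phi * x[i-1] + noise."""
    e = rng.standard_normal(size)
    x = np.empty(size)
    x[0] = e[0]
    for i in range(1, size):
        x[i] = phi * x[i - 1] + e[i]
    return x

# Naive two-sided 5% threshold for the correlation of n i.i.d. pairs.
r_crit = 1.96 / np.sqrt(n)

hits = 0
for _ in range(trials):
    a, b = ar1(phi_temp, n, rng), ar1(phi_proxy, n, rng)
    hits += abs(np.corrcoef(a, b)[0, 1]) > r_crit

print(hits / trials)   # well above the nominal 0.05
```

    The false-positive rate comes out around three times the nominal level, which is why the standard correlation test statistics are “not reliable measures of significance for screening proxies.”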

    Page 19 MW 2010:

    We are not the first to observe this effect. It was shown, in McIntyre
    and McKitrick (2005a,c), that random sequences with complex local dependence
    structures can predict temperatures…

    … Broadly, there are two components to any climate signal. The first component
    is the local time dependence made manifest by the strong autocorrelation
    structure observed in the temperature series itself. It is easily observed
    that short term future temperatures can be predicted by estimates of
    the local mean and its first derivatives (Green et al., 2009). Hence, a procedure
    that fits sequences with complex local dependencies to the instrumental
    temperature record will recover the ability of the temperature record to
    self-predict in the short run.

    The second component–long term changes in the temperature series–
    can, on the other hand, only be predicted by meaningful covariates. The
    autocorrelation structure of the temperature series does not allow for self-prediction
    in the long run…

    …Thus, climate scientists are overoptimistic: the 149 year instrumental record has significant local time dependence and therefore far fewer independent degrees of freedom.
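
    The degrees-of-freedom point is easy to quantify with the standard AR(1) effective-sample-size approximation, N_eff ≈ N(1 − r₁)/(1 + r₁) — a textbook formula, not one taken from MW 2010 — using the record length and lag-one autocorrelation quoted above:

```python
# Effective sample size of an autocorrelated series (AR(1) approximation).
n, r1 = 149, 0.6                  # record length and lag-1 autocorrelation
n_eff = n * (1 - r1) / (1 + r1)
print(round(n_eff))               # ~37: far fewer independent degrees of freedom
```

    On this rough accounting the 149-year record carries only about 37 independent observations, which is the sense in which the authors call climate scientists overoptimistic.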

    Page 22 of MW 2010:

    On the other hand, Brownian motions and other pseudo-proxies with
    strong local dependencies are quite suited to interpolation since their in-sample
    forecasts are fitted to approximately match the training sequence
    datapoints that are adjacent to the initial and final points of a test block.
    Nevertheless, true proxies also have strong local dependence structure since
    they are temperature surrogates and therefore should similarly match these
    datapoints of the training sequence. Furthermore, unlike pseudo-proxies,
    true proxies are not independent of temperature (in fact, the scientific presumption
    is that they are predictive of it). Therefore, proxy interpolations on
    interior holdout blocks should be expected to outperform pseudo-proxy
    forecasts notwithstanding the above.
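
    The interpolation-versus-extrapolation asymmetry in that passage can be demonstrated with a small simulation of my own (block sizes and counts are arbitrary): for a Brownian motion, linearly bridging an interior holdout block from its flanking training points beats extrapolating past the end of the training data.

```python
import numpy as np

rng = np.random.default_rng(3)
trials, n, block = 2000, 149, 30

rmse_interior, rmse_end = [], []
for _ in range(trials):
    w = np.cumsum(rng.standard_normal(n))   # a Brownian-motion pseudo-proxy

    # Interior holdout: interpolate linearly between the training points
    # adjacent to either side of the held-out block.
    i0 = (n - block) // 2
    pred_int = np.linspace(w[i0 - 1], w[i0 + block], block + 2)[1:-1]
    rmse_interior.append(np.sqrt(np.mean((pred_int - w[i0:i0 + block]) ** 2)))

    # End holdout: extrapolate the last training value forward.
    pred_end = np.full(block, w[n - block - 1])
    rmse_end.append(np.sqrt(np.mean((pred_end - w[-block:]) ** 2)))

print(np.mean(rmse_interior) < np.mean(rmse_end))   # True
```

    That is why strong local dependence alone lets even meaningless pseudo-proxies look skillful on interior validation blocks, and why the authors caution that true proxies should be expected to do better still.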

  26. Kenneth Fritsch said

    Also look at Section 3.6 and Fig. 9 for analyses and discussion of using local versus NH mean temperatures for calibrating using the Lasso method.

  27. Carrick said

    Thanks for your efforts, guys. Amongst other things, this documentation appears to demonstrate that Eli never read the paper he was critiquing. I will read the paper in time; I just wanted to settle my own ideas about how it should be done, and the trade-offs between approaches, before I looked at what they did.


  28. Earle Williams said


    I think Josh deserves an attaboy for delivering a criticism to the best of his ability.

  29. cohenite said

    “Eli never read the paper he was critiquing”; yep; that would be the more favourable interpretation; this is a great paper; these boys know their stuff and are taking the piss; I like this:

    “We further make the assumptions of linearity and stationarity of the relationship between temperature and proxies, an assumption employed throughout the climate science literature (NRC, 2006) noting that ”the stationarity of the relationship does not require stationarity of the series themselves” (NRC, 2006).”

    Well Mann made sure the series were stationary; he did everything except nail the things to the floor. Imagine what M&W would find if they reasonably discarded these unreasonable assumptions.

  30. […] the Air Vent: MW10 – Some thoughts Climate Audit: McShane and Wyner 2010 William M. Briggs: The McShane and Wyner Gordie Howe Treatment Of Mann Deep Climate: McShane and Wyner 2010 Deltoid: A new Hockey Stick: McShane and Wyner 2010 Rabett Run: A Flat New Puzzler Klimazweibal: McShane and Wyner on climate reconstruction methods […]

  31. Szerb fan said

    How long is it likely to be before someone re-engineers McShane & Wyner (a) with Tiljander the right way round or maybe excluded; (b) without bristlecone pines; (c) without both of the above?

    Perhaps a suggestion that could be put to the authors? I know they wanted to steer clear of the climatology side and focus on the stats, but maybe just one of the above will be enough to change the shape of the reconstruction significantly.

