Proxy Methods

I was lucky enough to get some time to spend at the ICCC today.  Realizing I’m president, I simply left work and drove to Chicago, turns out nobody fired me.  I had an amazing conversation with Lucia about her PhD work, which surprisingly enough I had some background in, I met Craig Loehle for the first time (he doesn’t look like I’d expected, far younger 😉 ), saw Anthony Watts briefly and I got to spend about an hour with Steve McIntyre discussing hockey stick math – how fun is that! Cool day all in all.

We don’t talk about hockey sticks enough here lately, but after my conversations over the last couple of days I think a brief discussion of the math of hockey sticks – aka paleoclimate reconstructions – is in order.  I’m afraid I’m not planning anything with enough written math for some of you, but rather another attempt at a generic explanation of how hockey stick paleo reconstructions are made and how they go wrong.  Perhaps those who already know this topic can help explain it to the rest, as simplifying the subject is important.  Not everyone knows, or cares to know, how to perform a multivariate regression – not that it’s impossibly hard.  Most of this article has to do with tree proxies, but it applies to the other varieties as well.

First, the methods of paleoclimate temperature reconstructions are very similar in one aspect.  They take items which are assumed (blindly) to have a temperature signal.  Blindly is a criticism, but it’s also completely real: nobody has tested whether a bristlecone pine responds to temperature in a linear fashion, or even whether it responds by growing faster at all; it just makes sense, but it’s completely unproven.  Linear means something like growing 1.2 times as fast in a year that is 0.1 C warmer and 1.4 times as fast in a year that is 0.2 C warmer.  Nobody has demonstrated this, but it is assumed.

Second, nobody knows how much faster a tree will grow per degree C.  So if 100 trees of the same species have their ring widths measured, nobody knows whether X millimeters equals Y degrees C, and nobody knows whether each tree differs from the others.  In paleoclimate, a different weight for each tree is the preferred solution.

Third, experts know that the standard growth rate of an unmolested (same temp, same humidity, same everything) tree is non-linear.  Very ad hoc methods are used to flatten the growth curves – RCS flattens the general shape by fitting curves such as exponential decays to ring widths – yet nobody has determined what the unmolested growth curve of a tree actually is.  Lake sediments, mollusk shells, and so on: the other proxies are no better.

So we have unknown proxy curves with unknown calibrations and unknown linearity, and the solution is to combine them with misunderstood statistics.  I mean, why not – none of the data is verified, NONE OF IT, so why not use whatever method might mash it together?

———-

Let’s talk math.

Proxy reconstructions that I’ve been exposed to – a considerable number at this point – all consist of linearly weighted combinations of the data.  I’m talking about this:

temperature = Weight(1) * Proxy(1) + Weight(2) * Proxy(2) + … + Weight(n) * Proxy(n)

There are complex versions of this weighting where the weights change on a data availability basis, but it doesn’t change the concept.
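
In code form the whole family of methods boils down to something like this minimal sketch (Python, with made-up array names and random numbers standing in for real data – not any particular paper’s implementation).  Everything interesting is in how the weights get chosen:

import numpy as np

# hypothetical proxy matrix: one row per year, one column per proxy record
n_years, n_proxies = 150, 10
proxies = np.random.randn(n_years, n_proxies)

# one weight per proxy, chosen by whatever reconstruction method is in fashion
weights = np.random.randn(n_proxies)

# the reconstruction is just the weighted sum of the proxy columns
temperature_estimate = proxies @ weights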

What happens though when one of these complex methods creates a negative weight?

First, the regressions are all trying to match a positive temperature trend.  If the proxy has a downslope, or an inconvenient anti-correlation to short term fluctuations in temperature, it can receive a negative weight from methods such as EIV.  This is a point Mann has made in the past:

The claim that ‘‘upside down’’ data were used is bizarre. Multivariate regression methods are insensitive to the sign of predictors.

This reads very clearly to most here, but to spell it out: the predictors are the proxies, and multivariate regression is the method, which really doesn’t care whether the sign is right side up or upside down.  The point of these methods is to match the measured temperature curve using noisy data, so for every proxy which is used upside down, a different proxy must counter the effect.  They’re all temperature after all, and they all should be warming over decadal time scales – but they don’t.

So when a temp curve is inverted with a negative weight, another temp curve must compensate.

Steve McIntyre made the point this weekend that in multivariate regression you want the predictors to be orthogonal; if they are not, your matrix is singular (or near singular) and your weights can shift to large and opposite values very quickly as they work to cancel the noise and match the temperature signal.  In the case of proxy multivariate regressions, if all the proxies contain the same signal (or nearly the same), it’s only the noise which provides any orthogonality; the rest of the matrix is nearly singular.  It’s a nice way to think of the problem faced by these regressions.

With a near singular matrix, straight multivariate regression will create weights which are both extreme positive and extreme negative, which then combine to match the signal in the calibration range very well.  However, temperature is temperature, and negative weights mean that we’re reading temperature – upside down.  A no-no in most circles.

Let’s take a moment to consider what the best possible weighting would be for near singular proxies – proxies with the same signal underlying the noise.  If they were all scaled to a reasonably similar variance, and they all contain some temperature signal plus a bit of red noise (Mann assumes a very high signal to noise ratio of 0.4 in his 07 paper), then your best possible mathematical weighting would be equal weights – an average.  The resulting average curve could then be scaled to temperature.

It’s hard to beat an average.

But when you begin regressing the same near singular matrix, the weights go all over the place, one side compensating for the other until values are pushed to extremes.  These weights will match the temperature a reconstruction is regressing to better than a simple average – by definition – but as they do, the weights quickly become physically meaningless.
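
A toy simulation makes the point.  This is my own construction for illustration – ten proxies sharing one signal plus independent noise, calibrated against a noisier measured record – not anyone’s published setup:

import numpy as np

rng = np.random.default_rng(0)
n_years, n_proxies = 150, 10

# one shared "temperature" signal; each proxy is that signal plus its own noise
signal = np.cumsum(rng.normal(size=n_years)) * 0.2
proxies = signal[:, None] + rng.normal(scale=0.15, size=(n_years, n_proxies))

# the "instrumental record" we calibrate against: signal plus measurement noise
measured = signal + rng.normal(scale=0.25, size=n_years)

cal, pre = slice(100, 150), slice(0, 100)   # calibration window vs reconstruction period

# straight multivariate least squares over the calibration window
w_ols, *_ = np.linalg.lstsq(proxies[cal], measured[cal], rcond=None)
print(np.round(w_ols, 2))        # weights scatter well away from 1/n and often change sign

# equal weighting (a plain average) for comparison
w_avg = np.full(n_proxies, 1.0 / n_proxies)

# out-of-calibration error against the underlying signal; the average is usually the safer bet
for name, w in (("regression", w_ols), ("average", w_avg)):
    print(name, round(float(np.mean((proxies[pre] @ w - signal[pre]) ** 2)), 4))

Nothing in the regression stops a weight from going negative; the average at least respects the assumption that every proxy is a thermometer.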

Why would you take two trees of equal variance and weight one tree 2 times and the other -1?  Physically it defies the definition of a temperature proxy; it preferentially chooses the pretty shape of the 2-times tree rings and leaves us with a worse result in the all-important reconstruction period.  Of course, the two series produce a far better match to actual temperature in the calibration period, but the weights are based entirely on the orthogonal information in the proxies – the noise – and any spatial information recovered from these reweighted proxy thermometers is horribly corrupted.  If the historic signals were clean temperature signals, one proxy would show the historic warming at 2x and the other at -1x.  A clearly non-physical result, but Mann’s sophist answer is right: multivariate methods are insensitive to sign.

This is not what causes the historic variance loss so often discussed here.  The loss, or de-amplification, of signal in the historic portion of reconstructions – the thing that guarantees unprecedentedness in a reconstruction – is created through preferential noise selection.  We’ll talk about that at a later time though; this post is about achieving a physically meaningful result from a regression.

Truncated least squares methods are one way that near singular data can be constrained to a more reasonable result.  These methods limit the amount of information used in the regression to K PCs, giving enough degrees of freedom to achieve some variable weighting without allowing so much freedom that the full multivariate extremes appear.  SteveM has repeatedly written that as more PCs are added we get further from the true result (equal or near equal positive weights), allowing more of the +/- weighting to creep into the algorithm and resulting in overfitting of the remaining data.  In the upcoming Antarctic paper – still pending review – a considerable amount of effort was put into finding the correct truncation parameters to prevent overfitting while ensuring that relevant temperature information was not extended across the continent.
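
As a rough illustration of the truncation idea (my own sketch, not the code from the paper), here is a truncated-SVD regression that keeps only the leading K directions of the proxy matrix, run on the same kind of toy proxies as above:

import numpy as np

def tsvd_weights(X, y, k):
    # least-squares weights for y ~ X using only the first k singular directions
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_inv = np.zeros_like(s)
    s_inv[:k] = 1.0 / s[:k]          # invert only the k largest singular values
    return Vt.T @ (s_inv * (U.T @ y))

rng = np.random.default_rng(1)
signal = np.cumsum(rng.normal(size=50)) * 0.2
proxies = signal[:, None] + rng.normal(scale=0.15, size=(50, 10))
measured = signal + rng.normal(scale=0.25, size=50)

for k in (1, 3, 10):
    print(k, np.round(tsvd_weights(proxies, measured, k), 2))
    # small k tends to hold the weights near equal; k = 10 is the full regression again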

The methods are complex enough that any lack of care results in non-physical results such as negative weights or, equally bad, extreme weights.  Current methods such as CPS and various regressions are too often being incorrectly applied in paleo-science with no concern for the physical meaning of the result.  As an engineer, I have a hard time understanding why there isn’t more concern about the physical meaning of the weights as well as the unverified data.  I mean, consider that since the proxies are unverified thermometers, extreme, negative, or near-zero weights go against the original assumption that these proxies are responding ‘linearly’ to temperature.  The scientists ignore their own assumption.

I’m a little tired of writing now, but will try to continue this in another post.  This was part of the discussion I had at the ICCC conference with SteveM – I bet you wish you could have been there 😀

79 thoughts on “Proxy Methods”

  1. Thanks, Jeff, for the explanation. If I understand you correctly, the dendroclimate data is de facto spongy, freighted with theoretical and statistical risk.

    I guess my question is kind of simple. When Mann “fudged” the data, did he do so because he believed doing so was (for lack of a better term) a scientifically (and statistically) neutral move?

    Or did he do so because he understood that the “decline” indicated by the dendro-data invalidated his basic CAGW thesis?

    In the former case, what he did could be seen as a legitimate choice, no matter what one feels about his doing so. In the latter case, we have a case of clear and deliberate fraud.

    Of course, we’ll never know.

  2. #1, Actually, in ‘hide the decline’ it was Briffa’s data which was concealed.

    Mann’s reconstructions from the 98,99 timeframe were based on messed up PCA methods so they don’t really apply to proper regression math. Not that I’m not suspicious of Mann’s intentions at times, but this post isn’t about fraud or intent but rather problems with the typical methods when properly applied.

  3. Why cannot it be that the reconstruction didn’t track the recent instrument record, we don’t know why (yet), but we’re working on it, and until we get a handle on it, let’s avoid advertising this inadequacy?

    My reading of Briffa is that he knows he’s not there yet and he’s working on it. Too bad some of these guys have to represent these studies or reconstructions as having more utility than they do.

  4. You wrote,

    in multivariate regression you want the predictors to be orthogonal, if they are not your matrix is singular (or near singular) and your weights become unpredictable very quickly. In the case of proxy multivariate regressions, if all the proxies contain the same signal (or nearly the same), it’s only the noise which provides any orthogonality, the rest of the matrix is nearly singular.

    I lack the math background to understand this. In terms of vocabulary, are “predictors” the temperature-related parts of the proxy signals? By “orthogonal”, do you mean proxies that contain diverse “pieces” of the temperature signal–rather than having common bits of temperature signal represented often, while other bits of the temperature signal are missing entirely?

    I expect this is way off base. Back to lurking.

  5. #5, the proxies (predictors) are used to create temperature; if they all contain the same signal, combining them mathematically is an unstable proposition – singular.

    I will reword it though.

  6. Re: the Divergence Problem (#2) —

    I just learned about this; it seems to be a useful window into the way that paleoclimate reconstructionists think.

    Dendrochronologists had been comparing tree-ring measures to temperatures for the instrumental period (~1881 on) to establish the relationship of the former to the latter (i.e. a curve with a slope and a correlation coefficient r^2). However, in 1998, Keith Briffa wrote a Letter to Nature (behind paywall), commenting on the analysis of recent tree-ring data. He noted that the correlation that had been established over ~1881 to ~1960 didn’t hold from ~1960 through the Eighties and Nineties. The slope of the line was different–even going from positive to negative–and r^2 dropped. He listed a number of possible causes. Basically, “I don’t know.”

    The point of the entire exercise was to establish a line with a slope and an r^2, and use it to determine temperatures of the pre-instrumental era, on the basis of tree-rings.

    So Briffa’s unwelcome news left the field with two choices:

    (1) As before, add new information on tree rings and temperatures to the data from which the calibration curve would be calculated, to generate the most comprehensive calibration curve possible. This would necessarily mean that the “current” curve would have a different slope and a lower r^2 than the “historical” curves that had been built only from pre-1960 data.

    (2) Having examined the correspondence of the new tree-ring data to temperature and found it undesirable, decide to discontinue amending the calibration curve. Evaluate historical tree-ring data from the pre-instrumental period with calibration curves that exclude post-1960 data. The exclusion is justified on the basis that the post-1960 tree-ring data are in some way “bad” or “contaminated” by factors that are not relevant to tree-ring growth of the pre-instrumental period.

    Here is the relevant text on the Divergence Problem from IPCC’s AR4, Working Group 1, Chapter 6.6. As can be seen, Briffa picked #2 (emphasis added); it seems to be the consensus choice of the field.

    …Several analyses of ring width and ring density chronologies, with otherwise well-established sensitivity to temperature, have shown that they do not emulate the general warming trend evident in instrumental temperature records over recent decades, although they do track the warming that occurred during the early part of the 20th century and they continue to maintain a good correlation with observed temperatures over the full instrumental period at the interannual time scale (Briffa et al., 2004; D’Arrigo, 2006). This ‘divergence’ is apparently restricted to some northern, high-latitude regions, but it is certainly not ubiquitous even there. In their large-scale reconstructions based on tree ring density data, Briffa et al. (2001) specifically excluded the post-1960 data in their calibration against instrumental records, to avoid biasing the estimation of the earlier reconstructions (hence they are not shown in Figure 6.10), implicitly assuming that the ‘divergence’ was a uniquely recent phenomenon, as has also been argued by Cook et al. (2004a). Others, however, argue for a breakdown in the assumed linear tree growth response to continued warming, invoking a possible threshold exceedance beyond which moisture stress now limits further growth (D’Arrigo et al., 2004)…

    There are some significant statistical problems with post-hoc analytical approaches. In the pharma and medical-device world, great effort is put into the design of pivotal clinical trials to avoid having to undertake retrospective analyses, and to anticipate the issues that accompany subgroup analysis. Here’s a walk-through of this as a recent Blackboard comment.

    The paleoclimate community seems to be uninterested in these types of considerations.

  7. Your points one, two, and three would, each on its own, be a firing offense in the physics department. There’s a point four which you didn’t mention: tree growth depends on multiple factors, not just temperature. Hence, by the laws of algebra and geometry, you need as many equations as factors to extract the temperature. Again, this is hand-waved away by claiming that trees at the tree line are special.

    The whole field stinks and it’s definitely not science. I’m always amazed that you and McIntyre have the stomach to keep digging anyway, I couldn’t do it. You have my gratitude for exposing this trash.

  8. 10-“this is hand waved away by claiming that trees at the tree line are special.”

    On top of this, is the stupid assumption that, even though we are assuming the trees are recording fluctuations in temperature at the tree line, the tree line never moves! Preposterous, if it gets warmer the trees at the margin won’t be anymore and therefore won’t be “special” anymore either.

  9. Re: Paul Linsay (May 19 11:05),

    There’s a point four which you didn’t mention, tree growth depends on multiple factors,not just temperature.

    That’s the orthogonality issue, so he did mention it, sort of. Ideally each growth factor would be independent, i.e. orthogonal, and would be separated into its own principal component (PC) by PC analysis (PCA). But, of course, that’s not true. If the dendro’s could show that divergence was caused by an issue specific to something that happened after 1960 that could not have happened during the reconstruction period, then it would be fair to not include data after 1960. But they can’t. They only assert that this might be true.

  10. It’s been a while, but I seem to remember that ANY function can be approximated to an ARBITRARY CLOSENESS by a linear combination of orthogonal functions. We apply this principle when we solve partial differential equations using Fourier (or other) series.

    Therefore, to fit the instrumental temperature record, all you need is a large enough collection of garbage (proxies) that happens to be more-or-less orthogonal. And the more garbage (proxies) you have, the better the fit; because the multi-variate regression creates nothing other than a linear combination of orthogonal garbage.

    Of course, the history that is inferred (where you have no instrumental data) will be complete nonsense because none of the functions (proxies) has anything to do with temperature, they just happen to be more-or-less orthogonal.

    Do I have this about right? And if so, why is there debate?

    Surely I’m missing something.

  11. If one were to have a variable weight per x (time scale), why was the data post 1960 NOT weighted UP to match the thermometer?

  12. #14, The methods provide the best fit to temperature. If Briffa’s proxies were shoved into these methods, they would be de-weighted or flipped over to provide the best match to temp. CPS in Mann 08 would simply reject them; his answer was to chop off the ends and paste on upright blades before using them.

    #13, I don’t see anything wrong with your post except that the proxies are assumed to be non-orthogonal. Although, they aren’t checked for that.

    I just left a message at RC requesting a location for the pseudoproxies from Mann07.

  13. Wait a sec:

    If they are assumed to be non-orthogonal, then the matrix will be singular and the weighting factors will be fitting noise, so multi-variate regression is the wrong technique.

    Seems like multi-variate regression will fit the instrumental record best when the proxies aren’t temperature at all – just orthogonal.

    Maybe that was your point.

    So your plan is to test Mann’s proxies for orthogonality.

  14. #16, Someone said SteveM has already done that, but I’m interested in his pseudoproxies and how they compare to the real ones. I wonder if Mann will help me find them.

  15. I’ve wanted to rant on this for a while. ALL measurements are by proxy. We don’t measure temperature, we look at the position of the fluid in a thermometer. And make a whole bunch of assumptions about the construction and calibration thereof. We measure effects and assume causes. The farther away, like tree rings, we are from causes, the greater the assumptions. And the greater the chance of error. No matter what analysis method, the basic data has to have some correlation to the thing you want to measure. Tree rings may respond to temperature, but how much? And what other factors have a greater effect? As I understand it, Briffa’s results came from a single Yamal tree. That has to have a very low signal to noise ratio.

    The same applies to almost all other measurements. Earth’s temperature from satellites measuring UHF amplitude that may come from many things besides temperature? Ice thickness and sea level from satellites whose orbital variation exceeds the claimed measurement accuracy by a factor of 100 or more?

    Sure, you can ‘calibrate’ these things, but the method has to have much greater accuracy than the thing you want to calibrate. Ideally, ten to one or more. Having followed this discussion for several years, I don’t see anything close to that.

    I am sure with a proper analysis method, or ‘adjustment’, you can get any answer you want. It certainly seems to have worked that way so far.

  16. #16, If the proxies carry a general warming signal with noise caused by other factors, and we fit to a cleaner temp signal, we are really re-weighting according to the noise, when the ideal weights for similar-variance proxies would be equal. Some could probably contend that point on the basis that different trees have different temperature responses, but the weights should never go negative (by the initial assumption) and they shouldn’t vary by much.

  17. Jeff,

    I don’t want to drag out our conversation, so stop when you want.

    What exactly are the proxies? I know they are tree rings. But are they widths (in cm) or a density (in whatever unit)? Have these been normalized – at one point I thought the rings were normalized for species and for age (older trees grow slower). Is each proxy an individual tree, or is it the average of a group, or the average of that species? Does multi-proxy mean multiple trees, multiple species, or does it include things like clamshell fossils (or whatever)?

    Sorry for the dumb questions.

    BTW, when steve tested, were the proxies orthogonal or not?

  18. Suppose I have one proxy, but a number of measurement series. How should each be weighted in the multiproxy reconstruction?

    My data source might be, say, a lengthy core of sediments that I recovered from the bottom of a lake (since they are “varved,” I can date them, analogous to tree-rings).

    I might have access to, say, four measurement series. They could be, for each varve:

    * Thickness in mm

    * A tally of the light-colored particles, to assay the deposition of inorganic material

    * A tally of the dark-colored particles, to assay the deposition of organic material

    * The X-Ray density of the sample, determined by digitizing X-Ray film

    For a multiproxy reconstruction: should these four data series be weighted independently, as they each concern a somewhat-different property of the sediment core? Or should the single core receive a single weight? Is there a non-ad-hoc approach to answering this question?

  19. 1) They sometimes use ring width and sometimes wood density–and argue over which is best.
    2) Trees respond to temperature. You can model this over the calibration period, but to determine temperature in the past you would need precipitation data for the past to factor it out–no one has that ever, so no answer can be found for a full model for past periods if you include precip in the model (and it usually turns out to be significant).
    3) Even at treeline there is a response to precip (see item 2).
    4) What happens if you assume linearity and it is nonlinear? see: Loehle, C. 2009. A Mathematical Analysis of the Divergence Problem in Dendroclimatology. Climatic Change 94:233-245
    available at http://www.ncasi.org//Publications/Detail.aspx?id=3273

  20. What is the point of linearly combining the proxies?

    If we use AMac’s example, I would fit each of the measurement series as a function of temperature. To reconstruct a temperature history, I’d only use the data that were well described using only temperature.

    I can’t see the rationale for trying to describe temperature as a linear combination of different proxies. Seems like a cause-and-effect fallacy. And even if I did it, what do the weights mean?

  21. Craig, some questions wrt precipitation:
    1. Are precipitation trends independent of temperature?
    2. Does precipitation vary on millennial time scales?
    3. Could a precipitation-sensitive proxy reconstruction be orthogonal? (in contrast to the theoretical non-orthogonal character of a purely temp proxy which Jeff has suggested)

  22. Re: 26. Precipitation trends can differ from temperature trends, for example if paleo climate affected the location of the jet stream which brings precip. Yes, precip varies on millennial time scales. The last ice age was cold and very dry most everywhere. If precip is the main driver, then you may be better off than with a temperature proxy. For example, it has been shown that Rocky Mtn tree rings nicely predict the outflow of the Columbia River over 400 years, because tree growth is limited by precip – once it gets dry the trees stop growing, and the effect is nearly linear. But since trees often respond to both temp & precip, how would you separate these effects in the past when all you have is ring width (no precip data)?

  23. #27

    It would be difficult to separate the effects as you suggest, however, if temp proxies should be non-orthogonal as Jeff suggests, is it possible to demonstrate the presence of a second factor contributing to trw with a properly constrained algorithm?

  24. Re: 28. It is quite common, in tree ring studies that are NOT attempting paleo reconstructions but rather trying to understand tree growth, to find moisture in some season to be a major predictor. To then simply leave moisture out because you assume the tree to be temperature limited seems…suspect? Sloppy? But if you leave it in you can’t predict temp from ring width (or density) because it is no longer univariate.

  25. #26: A particularly interesting illustration of the issues you bring up is that of the bristlecone pine tree rings from California and Colorado that formed the most heavily weighted proxy set of the original “hockey stick” study. Their 20th century growth pattern correlated well with the global (not local, but that’s a different story) temperature history for the century to that point, and their patterns went back at least 1000 years.

    We know from many threads of evidence that the mountains where these trees are found were very warm during the medieval period (tree lines were 1000-1500ft [300-500m] higher then, implying temps 3-4C warmer than at present), but also very dry (lake levels were far lower then — Fallen Leaf Lake, where I spend a week every summer, very near the California bristlecones, was at least 120 ft lower than present from at least 1000CE – 1200CE, known from trees pulled up off the present lakebed) — what many scientists have called a “megadrought”.

    The BCP tree rings were very narrow during that period. The hockey stick algorithm interpreted this as evidence of low temperatures, enabling the claim that the 1990s were the warmest decade in the past millennium. But how do we know this was not due to the low precipitation?

  26. To then simply leave moisture out because you assume the tree to be temperature limited seems…suspect? Sloppy? But if you leave it in you can’t predict temp from ring width (or density) because it is no longer univariate.

    Jeff, Craig, correct me if I’m wrong, but I think the suggestion is that relatively even proxy weighting would be an expectation which is consistent with true temp proxies plus random red noise. If we know that the algorithm is properly constrained and resulting proxy weighting is significantly uneven, does this suggest that there *must* be another factor(s) even if it can’t be separated?

  27. #30 Curt

    Interesting that you should mention that. I thought of the same thing during this discussion and went back and skimmed MM05(GRL).

  28. #31: If some proxies receive very high weighting compared to others, this is probably an indicator of spurious correlation. If they are true proxies, one should just average them.

  29. In a slightly simplified (but I think entirely accurate) sense, and assuming that any two proxies that represent temperature are non-orthogonal: the fitting technique is attempting to describe a temperature vector as the product of a weighting vector and a proxy matrix. There are, then, three possibilities.

    1) The proxies are temperature, thus non-orthogonal. The proxy matrix is singular. You presumably can get a good fit to the temperature vector, but the weighting vector means NOTHING!! because there are an infinite number of vectors that will give the exact same fit – in algebraic terms: the system is underspecified. The proxies may indeed be useful for reconstructing climate, but not via this technique.

    2) The proxies are not temperature, but are non-orthogonal. The proxy matrix is singular. You presumably get a bad fit to the temperature vector. Again the weighting vector means nothing. You have wasted your time.

    3) The proxies are orthogonal. In this case, no more than one of the proxies can actually represent temperature. You can fit the temperature vector to an arbitrary closeness by increasing the number of orthogonal proxies, but you cannot reconstruct climate because the proxies aren’t temperature – just math abstracts that, when added, approximate the temperature vector.

    Sorry for so many posts. Guess I got a little fired up. I’ll tone it back, and leave room for those that actually know something about climate science.

  30. #34, I like it, nice job, but there is another possibility.

    The proxy data you want to see are correlated generally and not orthogonal on longer timeframes, but the noise is created by non-random processes which are also not orthogonal. As Craig Loehle showed above temperature may not even be the main signal.

    Therefore, for weighting in case 1 with low S/N, there is enough difference between the proxies that the weighting is important, yet the signal is such a small fraction of the noise that the weights become random results based on other factors.

    Yes it is that undetermined, but it’s a fun problem!

    If you haven’t checked it out, you can figure out these posts pretty quickly.

    Warning though, this is the reason I have a climate blog. You might break your brain.

    https://noconsensus.wordpress.com/2009/06/20/hockey-stick-cps-revisited-part-1/

    https://noconsensus.wordpress.com/2009/06/23/histori-hockey-stick-pt-2/

  31. #34, by ‘break your brain’ I mean that I couldn’t believe that more wasn’t done. The comment wasn’t intended to be an aloof cocky thing but if you read it other than I intended, it might be read that way. When I learned what was happening with hockey sticks, I stayed up until almost 3 am thinking about them. After that time tAV’s blog style changed dramatically.

  32. Trouble with tree rings is simple. Width of tree rings has nothing to do with temperature and everything to do with rainfall. Good wet years give wide rings, drought years give narrow rings. Doesn’t matter what the temperature is, wet and cold, wet and hot, same same, wide ring.
    You can apply any sort of mathematical manipulations you like, but width of rings won’t become a measure of temperature, it’s a measure of rainfall. You can correct and plot and call it temperature, but its still rainfall.
    That’s why they dropped off tree ring data after about 1960, it didn’t match temperatures recorded by thermometers.

  33. #37, An interesting statement which, from my aeronautical perspective, cannot be verified or disproved. Do you have any references (peer-reviewed or otherwise) for your comment?

  34. Re: kdk33 (May 19 18:23),
    There’s been loose talk about orthogonality and singularity, which I think is going to lead to misunderstanding. You’d like proxies to be orthogonal, but it never happens. But there’s a long way between being non-orthogonal and getting a singular matrix. They don’t have to be orthogonal to be usable.
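
    A quick numerical illustration of that distinction (toy numbers of my own):

    import numpy as np

    rng = np.random.default_rng(2)
    t = rng.normal(size=100)

    # correlated but usable: a shared signal plus independent noise in each column
    X_corr = t[:, None] + rng.normal(scale=0.5, size=(100, 3))
    # effectively collinear: the same signal with almost no independent noise
    X_coll = t[:, None] + rng.normal(scale=1e-6, size=(100, 3))

    print(np.linalg.cond(X_corr))   # modest condition number; regression still behaves
    print(np.linalg.cond(X_coll))   # enormous condition number; the weights are unidentifiable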

  35. It all makes sense to me, shame paleo scientists can’t seem to understand the issue!

    As an engineer (and not a statistics master) the whole issue seems bizarre. If the trees have an assumed linear and uniform response to temperature, then simply calculate an area-weighted mean of the data – if their assumptions are correct about the tree response to temperature, they should all just read like a thermometer and have a similar record. Why aren’t they combined in the same way as thermometer data, then?

    Does this make sense?

    Why weight the data to fit a particular set of data? I know the idea is to scale it to convert it to a scale compatible with temperature data, but in doing so the actual pattern in the data is skewed toward data that matches better. Surely a better approach would be to do an area-weighted mean (based on coverage area, i.e. Thiessen polygons) and THEN scale the resulting profile to real temperature data; at least then the profile isn’t corrupted and bears some resemblance to temperature data, though that’s a big assumption!

  36. What is wrong with this logic?

    1) We know that a certain variable (width, density) is ‘sensitive’ to temperature
    2) We know that the 20th century has shown increasing temperatures
    3) We should therefore pick proxies which show 20th century warming
    4) Our PC reconstruction method should therefore extract this 20th century trend into the PC1

    Thanks

  37. Nick,

    To be clear: I’m not claiming expertise (in fact, quite the opposite), I’m trying to understand. Perhaps you can help.

    Let’s say I have 3 trees that are excellent thermometers. Their ring-widths have a linear response to temperature and are described by A+B*Temp, but each tree has a different sensitivity so a slightly different B: (A+B1*temp, A+B2*Temp, A+B3*Temp). So the ring width of any tree can be described as a linear combination of the ring width of the other two trees.

    Are these orthogonal?

  38. (Did I promise to shutup?)

    This fitting technique would seem appropriate if I wanted to model ring-width as a function of multiple factors – say temperature & precipitation & insect activity & etc – and I had a proxy for each of these factors. Then I would expect the proxies to be more or less orthogonal and the weighting vector would quantify the impact of these factors (measured by proxy) on the tree ring width.

    As applied here: we are describing the instrumental temperature data as a linear combination of… thermometers(!).

  39. #42 Shub Niggurath

    1) We know that a certain variable (width, density) is ‘sensitive’ to temperature

    There is more than one factor that determines tree ring width/density.

    2) We know that the 20th century has shown increasing temperatures

    3) We should therefore pick proxies which show 20th century warming

    You have now picked trees whose ring widths match 20th century temperatures. The ones you have discarded were discarded solely because they did not match the instrument record. What criteria do you use to discard erroneous tree rings from before the instrument record?

    4) Our PC reconstruction method should therefore extract this 20th century trend into the PC1

    Since you chose trees that match the instrument record, your extracted data will simply be another version of the instrument record. As you don’t have any criteria for discarding trees prior to the instrument record, all you are left with is a nice squiggly line prior to the instrument record that may or may not bear some relationship to past temperature.

  40. Two related but different jumping-off points:

    (1) We can meaningfully calibrate data such as tree-ring series to the instrumental record (1881-present). Once established, we can extract meaningful temperature values from the pre-instrumental portions of the data series, and thus construct a proxy-based paleoclimate history. The key task is to select or create the best methods for these purposes.

    (2) We might be able to meaningfully calibrate data such as tree-ring series to the instrumental record (1881-present). Once established, we might be able to extract meaningful temperature values from the pre-instrumental portions of the data series. Since we are certain to generate output with the form of a paleoclimate history, the key task is to select or create quantitative and qualitative methods that rigorously establish confidence intervals — in the knowledge that these intervals might be so wide as to render the entire exercise meaningless.

  41. I really do have to start my day job. But, I keep performing the following thought experiment:

    Let the temperature proxy data be composed of two parts: the (presumed) temperature signal plus noise. Let’s assert that if there is a temperature signal, then the proxy responds to temperature in a linear fashion. Let’s assert that, if the proxies contain a temperature signal, then the signal portions of the data are non-orthogonal. Let’s assert that the noise is random, hence orthogonal. Let’s remember that ANY function can be described to an ARBITRARY closeness using a linear combination of orthogonal functions.

    Now, when we try to describe instrumental temperature data as a linear combination of proxy data we can break that exercise into two parts: 1) fitting the temperature data as a function of the (presumed) temperature component of the proxy data, 2) weighting the proxies so that the noise component of the proxy data is minimized.

    But wait, when we do 1, because the temperature component of the proxy data is non-orthogonal, the weighting vector is meaningless – there are an infinite number of weighting vectors that will give the exact same fit. So, 2 selects the weighting vector that minimizes the noise. At best, the exercise is meaningless because the so selected weighting vector is not any more “right” than any of the other possible weighting vectors – it was chosen by the noise, which isn’t information. When you then reconstruct past climate (with no instrumental data) the best you can hope for is: no harm done. And there is a significant risk that by letting the noise weight the data, you’ve introduced spurious “information”.

    But wait, what if the proxies really aren’t very good thermometers – the presumed temperature component is just noise or is a combination of several factors that are acting differently on each proxy (say insects in one location, but not in another; more rain in one place than another; etc.). Then, the temperature component will be more or less random, hence orthogonal. Then the entire exercise (1 and 2) is simply weighting orthogonal data to fit the instrumental temperature data. This is GUARANTEED to work – we know that any function can be approximated using orthogonal functions. So, the weighting vector will arrange the “noise” to fit to the instrumental data. But when you reconstruct climate, in the non-instrumental period the noise will just cancel itself out – because it’s noise. If the instrumental data shows an increasing temperature trend your temperature reconstruction will be the following: an increasing temperature in the instrumental period (guaranteed by your fitting technique) and a mostly flat temperature profile in the pre-instrumental period because the data is actually noise. You get: A HOCKEY STICK!!

  42. Just a note that some people are mixing up the model used for a single tree (temp + precip) with the question of combining proxies.
    #48–yes, very good.
    Another key point is the stationarity assumption (dendro term): that trees that are good thermometers today were also good in the past. Since the environment around a tree, particularly the nearby competing trees, is NEVER the same over time, this is not valid. Even if it were, often fossil wood is used (e.g., in Yamal) in which case one has no way of judging if the fossil tree was a good thermometer or not. Further, nonstationarity means that the relationship of the tree to temp in the past was likely not the same. Consider a tree whose neighbors just died. It will show a big response to temp. But in the past it was crowded, and will be unresponsive.
    Finally, note that the origin of the field was for 1) evaluating current and recent past climate effects on trees and 2) dating events in the distant past like volcanoes or big droughts. The classic text by Fritts (1976) actually warns that the nonlinear growth response of trees is likely to be a problem for dates very far back in time.

  43. I haven’t read all the comments so this may be a repeat.

    1. Gather data from a group of trees.
    2. Pick trees out of that set that match the temperature record.
    3. When the trees no longer match the temp. record – hide the decline

    The error in the method? Given a large enough sample you can get to #2 by random variation. i.e. there is no proof that trees are thermometers. The result is #3 (to keep the funds coming)

    This is the no math layman’s short explanation.

  44. TerryS
    “What criteria do you use to discard erroneous tree rings from before the instrument record?”

    I do this in two ways.
    First – I include tree-ring data continuous with the instrumental data, but extending backward beyond the start of the instrumental record. This includes data that partially overlaps with the instrumental series.

    Second, to go back even further, I use the trees/other records which show good correlation with the *previous record/s showing good correlation with the instrumental record*. This way, I keep going back and back – ‘stepwise’. 😉

    Once I have assembled my network thus, I run my PC analysis which extracts the relation seen in the most closely correlated portion of the network (i.e., trees which responded in late 20th century) for the whole period.

    What is wrong with this method? 🙂

    Please note: I am not being sarcastic or anything. The other side believes this type of logic to be perfectly fine. I am trying to understand that.

  45. #51: please see my post 49. Your method, which they assume works, is based on the assumption that the trees which have a good correlation to temp in recent years also show a good relation (and the same relation) in the past – but NO proof of this is ever given. If it is drier in the past in a region, the same trees that are temp responsive in recent years will become moisture proxies in the past, etc.

  46. #51

    The problem comes in with screening noisy data based on correlation with temp. It is a method which guarantees that random noise will be interpreted as signal. The shape of the “blade” of the stick is a given – regardless of the ratio of signal to noise in the selected data – because it is defined by the selection criteria. However, the expression of the signal in the reconstruction period (pre-instrumental) will be attenuated due to the presence of random noise. The greater the noise relative to signal, the greater the effect. IOW, the amplitude of the MWP would shrink while the amplitude of the instrumental period is defined.
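
    A toy version of that effect (my own construction, not anyone’s actual proxy network): give 200 noisy series one true signal with an early warm bump and a late ramp, keep only the series that correlate with the ramp over the calibration window, average them, and rescale to the calibration period the way CPS-style methods do:

    import numpy as np

    rng = np.random.default_rng(3)
    n_years, n_proxies = 1000, 200
    cal = slice(900, 1000)

    signal = np.zeros(n_years)
    signal[300:400] = 0.5                      # hypothetical early warm period
    signal[900:] = np.linspace(0.0, 0.5, 100)  # calibration-era warming

    proxies = signal[:, None] + rng.normal(scale=1.0, size=(n_years, n_proxies))

    # screen: keep only proxies that correlate with the calibration-era record
    r = np.array([np.corrcoef(p[cal], signal[cal])[0, 1] for p in proxies.T])
    recon = proxies[:, r > 0.2].mean(axis=1)

    # CPS-style step: rescale so the calibration-era variance matches the "instrumental" record
    recon_scaled = recon * (signal[cal].std() / recon[cal].std())

    # the blade is built in by the screening; the early bump comes out smaller than the true 0.5
    print("true bump:", 0.5, "reconstructed bump:", round(float(recon_scaled[300:400].mean()), 2))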

  47. #38.
    It’s common knowledge among anyone who gardens, cuts wood, or grows plants that adequate water is the most important thing, followed by access to sunlight. This wikipedia article on dendrochronology flat out states that rainfall is the controlling factor. In fact, dendrochronology originated in the water-short American southwest, where the ring patterns are most pronounced due to the come-and-go nature of the rainfall.
    http://www.learnnc.org/lp/pages/1008

  48. What do people (Mannists) claim is the physical meaning of the linear combination of proxy data that results from this technique? What do they claim is the physical meaning of the weighting vector?

  49. #55: they claim that the weighting factor indicates that a particular proxy (tree location or sediment record) is a “better” indicator of temperature. It is a circular argument that ignores the noise problem and the possibility of spurious results.

  50. David Starr, remember that they try and pick tree proxies that are “temperature limited”. This involves site selection as well as species selection.

    For example a tree on a treeline might look temperature limited in its growth pattern.

    Lots of problems with this IMO, though: 1) The treeline can shift, and if it does the tree can change from a “creeping” bush-like form to a “normal” tree-like form. When that happens, you get a sudden spurt of growth. 2) If the tree is not at the margin for the full period you are trying to use it as a proxy for, it will “shift in and out” of being a good temperature proxy.

    IMO, the best way to handle this is to completely avoid tree lines and other marginal growth patterns, and select for where trees would remain in a robust growth form, then just calibrate for temperature, precipitation and cloud coverage.

    In general though, I agree with your premise, the relationship between tree ring growth and temperature is multivariate. The big problem with this is the relationship between temperature and precipitation will in general change over time, which means the idea there is a one-to-one mapping between temperature and tree ring size is just false.

    That doesn’t even address issues about sample collection. If you take samples from living trees, it’s pretty hard to correct for shifts in tree ring patterns over time. See, e.g. this. The green lines approximately follow the direction of growth for given years. The blue lines are the approximate line that a careful core sample would follow.

  51. Is there no better (for these purposes) proxy series than tree-rings? One that does not need the manipulation and massaging that the rings seem to require? Are the rings’ indications wrt temperature beyond the recall of more sophisticated analysis?

    From what I read, and marginally understand, it looks as though a case could be made that the rings are a dead end and cannot be teased into any reliable revelations. Is such a proof possible?

  52. Craig,

    In the same issue in which you published your article on the Divergence Problem, Jan Esper and David Frank discuss divergence pitfalls. They don’t discuss how important these pitfalls may have been in previous reconstructions. So I’m wondering if you or anyone else has done an analysis to try to quantify this effect, or if this is something that is at all reasonable to do.

    Divergence pitfalls in tree-ring research
    Climatic Change (2009) 94:261–266

  53. Has anyone ever tested how sensitive the goodness-of-fit is to changing values in the weighting vector?

    What I’m thinking is: if the proxy data is pure temperature signal then there are an infinite number of weighting vectors that would yield a decent fit. You could perturb, then fix, one (or several) elements in the weighting vector, but still get a good fit by readjusting the others. On the other hand, if there is no temperature signal – just noise – then the algorithm will converge to a particular combination of (orthogonal) noise. In this case, when one (or several) elements are perturbed the fit will deteriorate. Rapidly.

    If it could be shown that the goodness-of-fit to the instrumental data was highly sensitive to the weighting vector, wouldn’t that prove that the method was only fitting noise?
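
    A concrete toy version of that test (Python, invented numbers): hold one weight at an arbitrary value, let the rest refit, and see whether the calibration fit survives:

    import numpy as np

    rng = np.random.default_rng(4)
    signal = np.cumsum(rng.normal(size=100)) * 0.1
    proxies = signal[:, None] + rng.normal(size=(100, 20))   # low-S/N toy "proxies"

    w, *_ = np.linalg.lstsq(proxies, signal, rcond=None)

    def fit_with_first_weight_fixed(value):
        # hold proxy 0 at a fixed weight, refit the rest, return the calibration correlation
        resid = signal - value * proxies[:, 0]
        w_rest, *_ = np.linalg.lstsq(proxies[:, 1:], resid, rcond=None)
        recon = value * proxies[:, 0] + proxies[:, 1:] @ w_rest
        return np.corrcoef(recon, signal)[0, 1]

    for v in (w[0], 0.0, -w[0], 5 * w[0]):
        print(round(float(v), 3), round(float(fit_with_first_weight_fixed(v)), 3))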

  54. #61, Check out the posts linked in #35. It should answer your questions about the level of signal and noise in the proxies as well as different weightings.

  55. Jeff ID, thanks for getting this post started. Since I have, over the past weeks, decided it was time I got my feet wet doing some actual PC analyses this thread comes at an opportune time for me. I have started looking at station temperature data from 5 X 5 degree grids.
    There are plenty of caveats in any tutorial dealing with principal component analysis, and they fit well with an excellent thread that Ryan O had some weeks ago here at TAV on interpreting the PCs from PCA and determining how many to retain. The discussion on that thread pointed to the obvious benefits of using PCA for such processes as word compression and facial recognition. But in those applications the end result is rather straightforward and testable. What makes my suspicion index bristle is when I see a process used by climate science that introduces a potential for subjectivity, and PCA fits that description. If one has honestly constructed an a priori criterion for choosing PCs, that would allay some of that suspicion. Unfortunately this is not the case I see with many applications using PCA in climate-related work.

    One aspect of PCA that I did not see discussed in this thread was the use of rotation, either orthogonal or oblique, to obtain a better physical picture than obtained from the PC. That’s my next step in my analysis. I have always been told that physical process variables are seldom completely orthogonal.

    Changing the subject here, I wanted to entertain the claim made by some dendroclimatologists – Rob Wilson comes to mind – that tree ring (width and/or density) responses to volcanic events that were recorded independently in the deep historic past provide evidence that the tree rings are uniformly responding to that same event in the past, with the unstated assumption that the event abruptly changed the climate – and the inference that the event probably changed temperature the most. They will show tree ring responses from several reconstructions/proxies where one sees blips essentially at the same time. They do not, however, seem to think it is necessary to quantify that response. I would tend to agree that large volcanic events can influence tree ring responses and apparently over a relatively large population of trees, but when I look at the blips I see some large ones and some relatively small ones, which in my mind plays directly opposite to the claim made by the dendroclimatologists. But all that is from a visual realization and not from a formal statistical analysis on my part. Neither have I found a quantitative analysis on the part of the dendroclimatologists.

  56. Craig:
    “your method which they assume works is based on the assumption that the trees which have a good correlation to temp in recent years also show a good relation (and the same relation) in the past–but there is NO proof of this ever given”

    But there CAN be no proof for this.

    “…the same trees that are temp responsive in recent years will become moisture proxies in the past”

    I agree. But in the same way as above, you should prove that there were significantly different moisture regimes at different time periods. And this proof should come from an independent second source.

    Trees are long-lived creatures – they will respond on all timescales to all kinds of things. What is the point of bringing everything into the discussion and tying ourselves in knots? Assuming that selected trees respond predominantly to temperature, we must proceed. It is up to us to ‘extract’ the temperature signal.

    Again – my aim is to try to understand the IPCC paleo logic, that is all. Please help.

    Thanks

  57. 64: there is data from lake levels and from river outflow (sediment layers in deltas) which independently shows that rainfall has changed a lot over time. So this is known at least qualitatively (i.e., not with precision and not monthly, but gross precip HAS changed). You are probably right that there can never be a proof that individual trees grew the same in the past. It is up to the dendros to prove this assumption. For me, since it can’t be proven, I toss tree rings out as a useful proxy.

    There are proxies which appear to be more linear (or at least monotonic) such as mineral ratios in plankton sediments or stalagmite layering. These are where I would focus my efforts, not tree rings, but addictions are hard to break.

  58. 64: I am afraid that science is all about “tying ourselves in knots”. If we know that something influences the effect we wish to measure (is confounding) we must either account for it or control it (as in a laboratory experiment). We can’t just ignore it because it is complicated.

  59. One last thought on Mann-ian math and then I’ll let it go (my head hurts)

    Reading your posts, I see that you have shown that the proxy data can be combined to approximate a variety of “temperature data”. That certainly suggests the proxy data has lots of noise and that Mann’s method is ambiguous.

    But I’m thinking the following:

    If the proxy data were pure temperature signal, then they would be (more or less) linear combinations of each other, and there would be an infinite number of weighting vectors that would give an equally good approximation of the instrumental temperature data. The weighting vector will be non-unique.

    If the proxy data are just noise, then the instrumental temperature can only be approximated using the unique weighting vector determined by the algorithm.

    So, it seems to me that if you can show that the weighting vector Mann used is unique, you’ve proven that it is fitting noise, not signal.

    …and at this point, I really will shutup.

  60. Re: kdk33 #67,

    I stumbled on this William Briggs post from just after Mann08’s publication. He makes a related point, I think. Briggs claims that by smoothing proxy time series and then subjecting them to further analyses, one risks finding trends in the data and then over-confidently fitting to them. That seems like a valid concern to this lay reader (Gavin Schmidt notes in the comments, “What rot.”)

  61. #67

    You raise a good point here. I think that Jeff is pondering along the same lines. IOW, is there a mathematical signature in the solution which valid temperature proxies must satisfy?

  62. #67 I think that the posts I linked above are a clear answer to your question about weighting. If we can make any shape we want with the proxies, we’ve shown that the weighting vector can change the results in extreme fashion. In the case of CPS, the weightings are binary: 1/n or 0.

    “So, it seems to me that if you can show that the weighting vector Mann used is unique, you’ve proven that it is fitting noise, not signal.”

    So my answer is yes, I think you’ve got it, and that’s what the posts above show. Although I would say ‘primarily fitting noise’ is more appropriate.

  63. Also, in my first posts on the hockey stick, I used least squares to fit proxies to temperature with analog weightings and, not surprisingly, got the same sort of results.

  64. Carrick (msg 57) said:

    >David Starr, remember that they try and pick tree proxies that are “temperature limited”. This involves site selection as well as species selection.

    I don’t believe I have ever heard of temperature limited trees. Are these hardwoods? softwoods? Maple? Birch? Spruce? Pine? Oak? Plywood? MDF?

    >For example a tree on a treeline might look temperature limited in its growth pattern.

    The tree line. Around here tree line is about 4500 feet above sea level. At that altitude it just gets too cold, too windy, too short a growing season, too much rock and too little soil for trees to survive. If tree ring analysis meant anything, a tree from the tree line would indicate really cold temperatures all year round. That might give you the flat base of Mann’s hockey stick, (cold all the time) but most of us don’t consider temperature at 4500 feet representative of anything more than weather on the high mountains is bad, all the time. We have a 6000 foot mountain summit up here that has the worst recorded weather on earth. Like 200 mph winds, 60 below zero, snow on the ground from September to June.

  65. “Trees”, even at the treeline, are not being used as thermometers in these methods. Out of millions of potential trees (alive and dead) that they could take from, and the tens of thousands of cores taken, the number of cores used in these analyses usually maxes out in the low hundreds, often only in double figures.

    From these “proxies”, the trees containing the “signal” can usually be counted on one or at most two hands.

    That’s not surprising, really. Even if you can filter out all the other factors affecting tree growth (which you can’t), you’re looking for a tree that has ignored its local conditions and grown in accordance with an average of temperatures recorded from all over the world, then blended together, weighted, and averaged. Only a magic tree has the ability to respond to other temperatures from all over the world instead of the local temperatures it actually experiences.

    Dendro-climatology is little more than the art of cherry-picking in the confirmation-bias orchard.

  66. 73. Using tree growth as a proxy for climate conditions is reasonable as long as you are not assuming a strict correlation between “climate” and specific variable such as temperature. Various agencies in the western US have used dendrochronological data and siting information to investigate historic rainfall patterns. It is clear for instance that the stumps of dead pines in the bottom of stream channels could only grow in such a location while stream flows were very limited. Too much water and the tree either drowns or washes away and yellow pine prefers to keep its ankles dry, so to speak. In the Californian Sierra Nevada there is evidence of prolonged droughts that lasted decades if not centuries. As others have said in this thread, the situation can be much more complicated. Rainfall may correlate either directly or inversely with temperature. It may correlate differently at different times, for BCP data especially this is an issue, since studies of sedimentation in the Mojave Desert document highly variable patterns of pluvial lake inundation and duration. BCP are extremely long-lived trees and they experienced the same extremely varying climate conditions that created the sedimentary patterns of the Mojave, where every combination of hot-warm-cool-cold and wet-damp-moist-dry and seasonality of rainfall have occurred.

  67. Dear Jeff,

    I saw that M. Mann pointed you towards some pseudoproxies in a discussion at RealClimate.
    He claims that his proxy studies were tested with “RegEM with TTLS” and CPS.

    Which confuses me a little bit. Isn’t part of his method to use decentered PC? (I just looked up CPS: https://noconsensus.wordpress.com/2009/06/20/hockey-stick-cps-revisited-part-1/ – as far as I understand that article, it is based on decentered PC = you give proxies which follow a certain trend a high weight??) And what exactly is TTLS?
    Last but not least, I would guess that there is a difference between the data he used (for example in his 2009 paper) and the pseudoproxies . . his hockey stick is usually made by a few “outstanding” proxies, whereas I would guess that in order to “successfully” test a centered PC he used virtually identical pseudoproxies which all show a “low frequency trend”.

    Thanks a lot for your blog, it helps a lot to understand things, all the best,

    LoN

  68. #75 I saw that too.

    There is a bit of confusion about different hockey sticks. Actually, there are several methods for making them – and they nearly all stink. That’s what gets people who’ve studied them started dropping f (fraud) bombs. CPS doesn’t use principal components; TTLS is truncated total least squares, which is a simultaneous multivariate regression that uses a limited number of PCs to prevent extreme overfitting.

    I’ve not investigated why his pseudoproxies don’t cause variance loss yet; it’s quite baffling at this point, having not looked at them. There are only a few rare instances where I can imagine a pseudoproxy working with any of these methods. One is where all proxies have a strong signal, as these do, with a very different frequency of noise on them – not sure if this is the case.

  69. Dear Jeff,

    thanks for your answer and sorry for my “mashingeverythingtogether”-question.

    Anyhow, the way I understand it, CPS and MDPC (Mannian decentered PC) seem to have a very similar effect: CPS favors certain proxies which match a certain predetermined trend – your older post nicely shows that. Mannian DPC normalizes the proxies assuming a trend in the last century and gives proxies which show such a trend a high statistical weight…
    If I understood you correctly, TTLS uses PCs (I guess Mannian decentered!?) and then picks a “random” number of significant PCs in order to prove the trend. Steve and you had a number of posts on that: if, let’s say, PC=3 might have an upturn, the statistical weight of that will be quite insignificant compared to PC=1 and 2 . .
    There is the question of what the meaning of such a trend in a PC=X is, if you change X for each proxy test in order to find a desired trend.

    No doubt you are aware of Steve’s work with pseudoproxies, where he shows that Mann’s 1998 method only shows the same trend as a centered method if almost all proxies are identical and contain the trend. As soon as you have many proxies without a significant trend and you sprinkle a few “wrong” proxies among them, MDPC will give a different result.
    I bet my 20 cents that Mann’s pseudoproxies all show the same trend and you and Steve already covered that in earlier posts/publications:
    In this case decentering will not show a difference from the standard method.

    BTW: I feel a bit silly for writing these things you already know, but please be patient with me . . in the best case you agree more or less with my writing, which means I got it right, or you might be willing to correct my statements above, in which case I will try to follow and learn more . .

    All the best in any case (and thanks again!!),
    LoN

  70. Mannian decentered PC is a very unique case which was only used in his 99 and I think 98 papers. After that the practice stopped, despite all the claims to the contrary; IMO the entire field realized it’s crap but won’t say it outright.

    TTLS, TLS, RegEM and other methods fit information from proxies to the temp curve. Data which doesn’t fit ends up deweighted or flipped over to provide the best fit. CPS simply eliminates data which doesn’t fit. There isn’t much difference between any of these methods in my mind: eliminating or deweighting have about the same effect, and flipping a proxy is equally as valid as eliminating it, from a science perspective.

    All of them preferentially select noise of the ‘correct’ shape to fit to the curve. Since the curve being fit is temp data, noise which normally cancels ends up being deweighted and the alleged ‘signal’ portion is really the result of selective noise filtering, guaranteeing an unprecedented whatever you want in the calibration range.
