Hide the Decline – Howto

There are a lot of interesting emails.  This one is worth calling attention to because of the notoriety of “hide the decline” from Climategate 1.  It’s my bold in the middle.  They chopped off the data and infilled it with temperature data.  This is slightly different from Mann08 but it is to the same effect: chop off the series and infill it with preferred data.  In this case though, Tim Osborn says it “may be not defensible”.

Let me just tell you kids, don’t try this trick to hide the decline on your high school science paper.

date: Mon, 16 Oct 2000 22:54:31 +0100
from: Tim Osborn <T.Osborn@uea.ac.uk>
subject: progress
to: k.briffa@uea.ac.uk, p.jones@uea.ac.uk

Hi Keith & Phil  (a long one this, as I have an hour to kill!)

We’re making slow-ish progress here but it’s still definitely v. useful.  I’ve
brought them up-to-date with our work and given them reprints.  Mike and Scott
Rutherford have let me know what they’re doing, and I’ve got a preprint by
Tapio Schneider describing the new method and there’s a partially completed
draft   paper where they test it using the GFDL long control run (and also the
perturbed run, to test for the effect of trend and non-stationarities).  The
results seem impressive – and ‘cos they’re using model data with lots of
values set to missing, they can do full verification.  The explained
verification variances are very high even when they set 95% of all grid-box
values to missing (leaving about 50 values with data over the globe I think).

In fact the new method (regularized expectation maximization, if that means
anything to you, which is similar to ridge regression) infills all missing
values (not just in the climate data), which is interesting for infilling
climate data from climate data, proxy data from climate data (see below).

As well as the GFDL data, they’ve also applied the method to the Jones et al.
data on its own.  The method fills in missing temperatures using the
non-missing temperatures (i.e., similar to what Kaplan et al. do, or the
Hadley Centre do for GISST, but apparently better!).  So they have a complete
data set from 1856-1998 (except that any boxes that had less than about 90
years of data remain missing, which seems fair enough since they would be
going too far if they infilled everything).

We’re now using the MXD data set with their program and the Jones et al. data
to see: (i) if the missing data from 1856-1960 in the Jones et al. data set
can be filled in better using the MXD plus the non-missing temperatures
compared to what can be achieved using just the non-missing temperatures.  I
expect that the MXD must add useful information (esp. pre-1900), but I’m not
sure how to verify it!  The program provides diagnostics estimating the
accuracy of infilled values, but it’s always nice to test with independent
data.  So we’re doing a separate run with all pre-1900 temperatures set to
missing and relying on MXD to infill it on its own – can then verify, but need
to watch out for the possibly artificial summer warmth early on.  We will then
use the MXD to estimate temperatures back to 1600 (not sure that their method
will work before 1600 due to too few data, which prevents the iterative method
from converging), and I will then compare with our simpler maps of summer
temperature.  Mike wants winter (Oct-Mar) and annual reconstructions to be
tried too.  Also, we set all post-1960 values to missing in the MXD data set
(due to decline), and the method will infill these, estimating them from the
real temperatures – another way of “correcting” for the decline, though may be
not defensible!

They will then try the Mann et al. multi-proxy network with the new method
(which they’ve not done till now).  They’ve given me the full data set, so we
can do stuff with that later.  I have, I think, all the programs needed for
his old method, so we could still look at that on our own, but he’s not keen
on spending time on that while I’m here.  I’ve swapped it for the MXD data set
(the Hugershoff chronologies, and also the gridded, but uncalibrated, version
of the Hugershoff chronologies).  The gridded stuff was needed for the
reconstruction efforts, because the 387 chronologies would all have had equal
weight and we wanted a simple way to account for clustered groups – the
gridded version that I made seemed the easiest way, even though that is the
Osborn et al. paper that is yet to be written!  What conditions do I need to
place on subsequent use of the MXD chronologies/gridded data?  That (i) we be
informed of what they’re doing with it; (ii) that Osborn & Briffa (and Jones?)
be co-authors on any subsequent papers, and if the MXD dataset provides the
core of the paper, then Schweingruber too?  We will all be on the paper that
comes out of the reconstructions I’ve just described, but I’m thinking about
any future stuff they use it for.

I hadn’t realised (until just now) that their reconstruction program is so
slow that it will take about 3-6 days to run each one!  They have about 6-8
separate processors/machines so we need to get them all various runs going on
at once.  Even so, results are unlikely to be available to Friday morning (I
leave Friday midday), or after I’ve got back home – this looks like being an
ongoing thing!

No need to reply to all/any of this – just thought I’d bring you up-to-date
while I had some time to spare.

Cheers

Tim
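
As a concrete illustration of the step Osborn describes near the end (set all post-1960 MXD values to missing and let the method infill them from the real temperatures), here is a minimal sketch in Python. It is not Schneider’s RegEM (the regularized EM method the email mentions); it only restates the move itself on synthetic data: declare the post-1960 proxy values missing, calibrate the proxy against the overlapping instrumental record, and infill the masked years from the thermometers. Every name and number below is hypothetical.

# Hypothetical sketch only -- not Schneider's RegEM. It shows the step Osborn describes:
# mask the post-1960 proxy values, then infill them from the real temperatures.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1856, 1995)
temp = 0.005 * (years - 1856) + rng.normal(0, 0.1, years.size)    # instrumental record
mxd = 0.8 * temp + rng.normal(0, 0.1, years.size)                 # proxy that tracks temperature...
mxd[years >= 1960] -= 0.004 * (years[years >= 1960] - 1960)       # ...with a synthetic post-1960 "decline"

missing = years >= 1960                                           # "we set all post-1960 values to missing"
slope, intercept = np.polyfit(temp[~missing], mxd[~missing], 1)   # calibrate on the pre-1960 overlap
mxd_infilled = mxd.copy()
mxd_infilled[missing] = slope * temp[missing] + intercept         # infill from the real temperatures

# The infilled segment now follows the thermometers rather than the trees -- which is
# the "correcting for the decline" that Osborn flags as possibly not defensible.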

71 thoughts on “Hide the Decline – Howto”

  1. Jeff,
    Again I can’t see the excitement here. He’s trying out a new method to see what it can do. That’s what scientists do. There’s no indication that he published anything based on what you’ve bolded.

  2. Nick,

    First, I don’t think I am excited. The point though is that they are as aware as you or I of the impropriety of the method. This is exactly what Mann08 did BTW, using other proxies.

  3. I finished my 1st and 2nd degrees in the late 60s and early 70s and then doctorate in the mid 80s, with lots of industry work in between and sweated the hard yards on every one of them. With this background, I have now come to the conclusion that what we are dealing with here is:

    (1) a small and highly exclusive elite of ‘conventional’ , well-educated older scientists on a mission to prove a notion that has already become firmly fixed in their own minds; but unfortunately

    (2) served by an army of younger, willing, more recent graduates ‘well educated’ in the post-modernist style who have no difficulty whatsoever with the bizarre proposition that ‘reality’ is whatever you can make it by an act of will e.g. by the manipulation of facts and data (or non-data, ahem).

    Add to this the thoroughly antediluvian environment (in the age of the Internet and mobile ‘phone) of a secretive, paper-based scientific literature run by coteries of editors beholden to none and coteries of referees answerable to no one and you have the ultimate farce.

    Only one thoroughly contemporary question remains:

    Will the dictators fall and true democracy come to climate science?

  4. Being generous, this is clear confirmation bias – how can the data be selected to give the answer we expect? This is no good because it can’t be scientifically justified.

    I wonder if people are being too judgemental with their accusations of malicious intent – having read the emails from one of the warmists posting here (maybe tfp, can’t remember now) it was fairly evident that he had a genuine conviction that without his personal support which he offered to the team, the oceans would be boiling whilst he was still around to watch. So the scale of alarmism has many different shades, and it is a self-perpetuating monster. The real enemy of the science is the confirmation bias, which seems only to be encouraged by the money flow.

  5. Nick doubles down on deleting inconvenient data. I guess Hockey Teams need cheerleaders, no matter what corruption they are excusing.

  6. #5 In most of the cases I think you’re right. I’m very tired of the whole thing now. It’s truly depressing. The emails are full of extreme-left garbage, huge amounts of government money, battles for funding and expensive trips around the world. Then there is the discussion of endless data which I’m now familiar with and I know with certainty that they are overconcluding. These guys know what is right in climate as well as what is best for us and they will stop at nothing to save us from their own demons.

    I’m really tired of it now. I’ve read hundreds of stupid emails and as I alluded to above, I’m having a hard time giving a shit what these jerks say.

    Hell, there is one email where a guy is retiring and they are deciding whether 10,000 dollars is a reasonable amount to spend from DOE on his retirement dinner. For Christ’s sake, pay for your own dinners, jerks, not with my money. Whatever, I’m just an engineer with my own company who has to work for a living. I’ve saved more CO2 than these dolts ever will, but there’s no safety net for me if we don’t make enough money. There is even an email about intentionally radicalizing climate science.

    Then they sit around and discuss how to chop off data which doesn’t do what they want and replace it with better data as though it might somehow be remotely reasonable. I hope they feel some shame because they should. I do know better though.

    What a crazy bunch of freaks.

  7. #6
    “Nick doubles down on deleting inconvenient data.”
    Bruce, you have no idea of what he is doing. He’s describing it carefully and properly. He has two data sets which give different temperature behaviour. Very common. There are many ways you can try to produce a combined model. He’s just come across a new one, and he’s trying it out.

    To see how it works, you’d always check the extremes, and that’s what he’s doing. As he says, it’s probably not a defensible estimate of true temperature, and there’s no indication that he’s ever used it as such.

  8. Nick,

    “may not be defensible”

    Under what conditions would it be defensible? Let me help with this — none, never, nada. There is no purpose to attaching fake data on the end of a series but one.

    The whole concept is an insanity by itself. It’s NOT checking extremes and the process has been done by others to this exact series in publication. That is the problem. They talked themselves into scamming the data to fit their beliefs that these scribbles actually are temperature.

    Which they are not.

  9. Jeff #9,
    It isn’t fake data. They are both proxies for temperature.

    You did this with Antarctica. You had satellite data and station data, and you used RegEM and variants to produce a composite signal. And I bet you tried the algorithm with each of the datasets by itself over periods, e.g. post-1980. At least, I hope you did.

  10. “They are both proxies for temperature.”

    Well I’m pretty sure that thermometers are proxies for temperature but how are you so certain that trees are?

  11. give it up Nick…after all this time it becomes embarrassing and people will start to think that you are actually stupid. I do not think you are stupid but you are starting to do a good impression. Running interference for the team will do you no good.

  12. This bit in the first para caught my eye “‘cos they’re using model data ”

    Does this mean what it sounds like: that they are using output from a model as temp data from which to model temp?

  13. I don’t think Nick is stupid. I think he genuinely favors dishonest and misleading “science” if it helps the “cause” along. He is a perfect cheerleader for the Hockey Team’s dishonest and misleading “climate science”.

  14. Nick, the simple question remains. If trees after 1960 are not used because (in their opinion) they are not giving a proper temperature signal, there is no way to ever assert that they are. The fact that some trees agree with instrumental records, sometimes, is indicative of nothing. The mannomatic method has produced temperature reconstructions going back what, 1,000 and even more years? Trees either describe temperatures or they don’t. Sometimes doesn’t cut it, and the fact that the period for which we have by far the best data to check indicates they don’t should eliminate dendro as a proxy for temperature.

  15. Suppose I did a drug trial for a new heart drug, and took the blood pressure of people on the trial, every week.
    There would be missing data every time a patient died; no worries, I could always just infill.
    Would that be O.K. Nick? You think drug companies should be allowed such practices?

  16. Peter,
    The utility of treerings as temperature proxies has been debated for well over 30 years. Mann didn’t invent it. And the question of divergence has been discussed in the literature for at least 15. Whatever else it is, it isn’t a climategate issue.

    DocM, I really have no idea what your point is. A very large part of science is about applying data from places where you can reliably collect it to places where you can’t. That’s what laboratories are about, for example.

    A classic case is O’Donnell Lewis et al. Antarctica has a lot of (mostly satellite) data post 1980, not much before. So make use of the post-1980 data to make a good model to illuminate the earlier period.

  17. Nick writes “the question of divergence has been discussed in the literature for at least 15”

    Discussed but not resolved. Trees just don’t make good thermometers. And certainly not ones that could be used to recreate a global temperature anomaly in the tenths of a degree range.

  18. Nick Stokes@18, see below. Notice that a Prof of Tree Physiology/Biochemistry Forestry & Environmental Management takes exception to the use of trees as thermometers. This issue was discussed extensively at Climate Audit long ago and the conclusion was the same. In fact, all you have to do is what I did, ask the local tree service. My guy laughed and said that tree rings mostly reflect how much water a tree gets. What’s more, if the roots on the north side get more water than the roots on the south side of the tree, the north side tree rings will be fatter. Tree rings aren’t and have never have been thermometers.

    As for Mann’s teleconnections, even though you didn’t ask, they’re ludicrous. What physical mechanism makes trees respond to “global average temperature anomalies” rather than the local temperature? Also please explain how six trees spread over a few acres in the Sierra Madres can represent global temperature. Would you believe that statement if the trees were replaced with thermometers?

    For your entertainment, Ed Cook versus RA Savidge, PhD, Prof of Tree Physiology/Biochemistry Forestry & Environmental Management. (By the way, does anyone know who Ed Cook is?)

    3219.txt:

    from: REDACTED Ed Cook
    subject: Re: Fwd: History and trees
    to: REDACTED REDACTED

    Rod’s comments are remarkably ignorant and insulting. I suggest that
    he stick to what he knows best and not claim that he understands
    dendrochronology and its methods. That way he would not sound so
    stupid. To suggest that dendrochronology does not embrace the
    scientific method and is as biased as he claims verges on libel. Of
    course, Rod has the right to his opinion. It is just a shame that he
    chooses to expose his ignorance of dendrochronology in such a
    negative way.

    >To the Editor, New York Times
    >
    >Further to the message below, I want to assure you that not everyone agrees
    >with the representations by David Lawrence. As a tree physiologist who has
    >devoted his career to understanding how trees make wood, I have made
    >sufficient observations on tree rings and cambial growth to know that
    >dendrochronology is not at all an exact science. Indeed, its activities
    >include subjective interpretations of what does and what does not
    >constitute an annual ring, statistical manipulation of data to fulfill
    >subjective expectations, and discarding of perfectly good data sets when
    >they contradict other data sets that have already been accepted. Such
    >massaging of data cannot by any stretch of the imagination be considered
    >science; it merely demonstrates a total lack of rigor attending so-called
    >dendrochronology “research”.
    >
    >I would add that it is the exceptionally rare dendrochronologist who has
    >ever shown any inclination to understand the fundamental biology of wood
    >formation, either as regulated intrinsically or influenced by extrinsic
    >factors. The science of tree physiology will readily admit that our
    >understanding of how trees make wood remains at quite a rudimentary state
    >(despite several centuries of research). On the other hand, there are many
    >hundreds, if not thousands, of publications by dendrochronologists
    >implicitly claiming that they do understand the biology of wood formation,
    >as they have used their data to imagine when past regimes of water,
    >temperature, pollutants, CO2, soil nutrients, and so forth existed. Note
    >that all of the counts and measurements on tree rings in the world cannot
    >substantiate anything unequivocally; they are merely observations. It
    >would be a major step forward if dendrochronology could embrace the
    >scientific method.
    >
    >sincerely,
    >RA Savidge, PhD
    >Professor, Tree Physiology/Biochemistry
    >Forestry & Environmental Management
    >University of New Brunswick
    >Fredericton, NB E3B 6C2
    >

  19. 21 Paul Lindsay – you really should get a better grip on major players.
    Edward R. Cook, Ewing Lamont Research Professor, Lamont-Doherty Earth Observatory, Biology and Paleo Environment, Director of Tree-Ring Lab.
    Also noted for work in Australia (Tasmania) on Huon Pine, Lagarostrobos franklinii, which has many properties suited for dendrochronology but lacks good calibration data for dendrothermometry because of a paucity of close and reliable weather stations.

  20. Trees are not reliable temperature proxies. It has been debated and never resolved. I remember a quote (sadly I cannot find it) from a guy who is an authority on tree rings (in Ireland I seem to remember) and he basically said using tree ring data for establishing temp trends was madness.

  21. Nick #18: “Peter,
    The utility of treerings as temperature proxies has been debated for well over 30 years. Mann didn’t invent it. And the question of divergence has been discussed in the literature for at least 15. Whatever else it is, it isn’t a climategate issue.”

    Nick you are countering arguments I didn’t make. I can’t imagine how the debate could go on for 30 years, given that treemometers don’t work in the period for which we have the best data. That should end the debate. The climategate issue is the intent to deceive in the IPCC graph, by welding temperature data onto proxy data, and trying to hide the fact. That is confirmation bias at its worst, and it is shameful and dishonest scientific practice.

  22. UEA’s ‘strategic alliance’ with Goldman-Sachs:

    http://www.ecowho.com/foia.php?file=4092.txt&search=Goldman-Sachs

    date: Mon, 18 May 1998 10:00:38 +010 ???
    from: Trevor Davies
    subject: goldman-sachs
    to: ???@uea,???@uea,???@uea

    Jean,

    We (Mike H) have done a modest amount of work on degree-days for G-S. They
    now want to extend this. They are involved in dealing in the developing
    energy futures market.

    G-S is the sort of company that we might be looking for a ‘strategic
    alliance’ with. I suggest the four of us meet with ?? (forgotten his name)
    for an hour on the afternoon of Friday 12 June (best guess for Phil & Jean
    – he needs a date from us). Thanks.

    Trevor

    ++++++++++++++++++++++++++
    Professor Trevor D. Davies
    Climatic Research Unit
    University of East Anglia
    Norwich NR4 7TJ
    United Kingdom

    Tel. +44 ???
    Fax. +44 ???
    ++++++++++++++++++++++++++

  23. If trees are reliable temperature proxies then why do studies rely on teasing the climate signal out from a handful of trees? Why do people like Mann need to invent hockey stick mining algorithms? If trees were reliable temperature proxies we’d have hundreds of thousands of samples from trees around the world telling us the same thing and matching up with the full instrumental record. We don’t.

  24. Re: Nick Stokes #18,

    DocM, I really have no idea what your point is [re: an analogy of testing a heart drug in #17]. A very large part of science is about applying data from places where you can reliably collect it to places where you can’t. That’s what laboratories are about, for example.

    One of the elephants in the room that the Osborn-to-Briffa email raises is post hoc analysis.

    AFAIK, no Mainstream climate scientist or advocate has demonstrated that they understand the concept, or its implications for the standard methods of paleoclimate reconstruction.

    From prior threads at the Blackboard and CA, I certainly include you in this group, Nick.

    Getting back to #17, if post-hoc analysis as practiced by mainstream climate scientists is as powerful and rigorous as claimed, then the FDA is entirely wrong in its insistence that pivotal Phase 3 drug and device trials must be prospective analyses with predefined endpoints.

    Which is it?

    The most common Mainstream answers demonstrate that the writer doesn’t understand the issue.

  25. #25
    “The climategate issue is the intent to deceive in the IPCC graph, by welding temperature data onto proxy data, and trying to hide the fact.”

    That is absolutely untrue. The IPCC plot is here. The proxy data is marked in blue. The instrumental data is marked in red. It says, in big letters right on the graph:
    “Data from thermometers (red) and from tree rings, corals, ice cores and historical records (blue).”.

  26. #28
    “I certainly include you in this group, Nick”
    And rightly so. I can make no sense of your FDA analogy. Climate scientists aren’t designing trials on the last millennium’s temperatures. They are in the past.

  27. Re: Nick Stokes #30,

    I can make no sense of your FDA analogy.

    At # 28, I had noted that “the most common Mainstream answers demonstrate that the writer doesn’t understand the issue.”

    It’s not hard to find information on the subject — which is a concern about statistical interpretation and in no way specific to the FDA review process. Wikipedia’s entry is the first thing that comes up on a Google search. Its first paragraph is a good overview.

    Post-hoc analysis (from Latin post hoc, “after this”), in the context of design and analysis of experiments, refers to looking at the data—after the experiment has concluded—for patterns that were not specified a priori. It is sometimes called by critics data dredging to evoke the sense that the more one looks the more likely something will be found. More subtly, each time a pattern in the data is considered, a statistical test is effectively performed. This greatly inflates the total number of statistical tests and necessitates the use of multiple testing procedures to compensate. However, this is difficult to do precisely and in fact most results of post-hoc analyses are reported as they are with unadjusted p-values. These p-values must be interpreted in light of the fact that they are a small and selected subset of a potentially large group of p-values. Results of post-hoc analysis should be explicitly labeled as such in reports and publications to avoid misleading readers.
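
    The inflation the quoted paragraph warns about is easy to demonstrate. A minimal sketch in Python (numpy and scipy are assumed here, and the “proxies” are pure noise, nothing climatic): screen a thousand unrelated series against a target and count the “significant” correlations before and after a multiple-testing correction.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    target = rng.normal(size=150)              # stand-in for an instrumental record
    proxies = rng.normal(size=(1000, 150))     # 1000 unrelated noise "proxies"

    # One correlation test per proxy: every look at the data is another test
    pvals = np.array([stats.pearsonr(target, p)[1] for p in proxies])
    print("uncorrected p < 0.05:", int((pvals < 0.05).sum()))               # roughly 50 hits by chance
    print("Bonferroni-corrected:", int((pvals < 0.05 / pvals.size).sum()))  # typically none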

  28. Ok, Nick, let me try. The graph in the WMO report (not the IPCC report) and some other places shows reconstructions which supposedly show how unusual recent temperatures are compared to past temperatures. The supposed agreement of the proxies with instrumental data and their agreement with each other is what gives the various spaghetti graphs their convincing power. To make the graph look “better” (i.e. more convincing), they chopped Briffa’s data at 1961 in some graphs (because it disagreed) and in other graphs replaced that post-1961 data with instrumental data. They also excluded reconstructions that did not fit the narrative (such as Loehle 2007, which they treat as if it does not exist, plus others). They also did things to align them better vertically (used a different base period for anomalies for calibration vs for plotting) – the purpose of these “tricks” was to make the graph look more convincing than is merited – that is, to deceive one into accepting the narrative or argument.
    The analogy with a clinical trial would be to fill in data for patients who died to make it look like they lived, and then get FDA approval based on these results.
    Your argument that the trick was never used is simply false.

  29. Nick needs to argue with Muller about whether the trick is appropriate.

    For me, I keep hearing “for Wales?!”

  30. Geoff Sherrington @ 23. Thanks for the info on yet another player.

    “Edward R. Cook, … noted for work in Australia (Tasmania) on Huon Pine, Lagarostrobos franklinii, which has many propeties suited for dendrochronology but lacks good calibration data for dendrothermometry because of paucity of close and reliable weather stations.”

    Once again, correlation is not causation. Unless someone does a set of controlled experiments that grow trees of that species under a realistic variety of conditions there is no convincing evidence that tree rings are thermometers. The presence of weather stations in the neighborhood is not sufficient. There’s a fir tree in my yard that is over 50 feet high. It used to be a stunted little thing about 5 feet high for years. Then, twenty years ago I had to cut down its neighbor, a 100 foot maple tree due to rot. The fir probably will grow to about 75 feet then start growing in diameter. Just looking at tree rings how would you distinguish between a change in temperature and a change in environment?

  31. #33, Craig
    “The graph in the WMO report (not the IPCC report)…”

    I think it would save time if someone here would just make a long list of grievances – 1-999 say. Then someone could just say 171! and everyone could groan or boo or whatever. It would save the tedious business of getting the facts right.

  32. #37
    Jeff, it often sounds like it. This is a typical exchange. A claim about a graph in the IPCC report. Wrong. But what he meant was a graph on the cover of the WMO report. I should have known that.

    Well, I sorta did. But what to do? A lot of people do in fact think that this fuss about that graph relates to the IPCC report. And the IPCC do it with proper care, which earns no credit here. But if there was a numbering, well I could just say – do you mean #128? And a numbering system would help stop the complaints multiplying. Enforce a bit of uniqueness. That would shrink climategate hugely.

  33. nick is really susceptible to Jedi mind tricks you know. His protestations are quite pathetic although do provide much missed comedy value.

  34. Nick,

    The IPCC does little with proper care. Many of the scientists try, but they are nearly always very naive people who don’t recognize the human environment around them. I’ve never written anything here which wasn’t supported or retracted in the face of evidence.

  35. Nick Stokes: “Well, I sorta did. But what to do? A lot of people do in fact think that this fuss about that graph relates to the IPCC report. And the IPCC do it with proper care, which earns no credit here.”

    So Nick, you knew, but chose to obfuscate. Good on you. You may be chosen for the team yet.

  36. Nick,

    Honest question: When you were a kid did you seek out hornet nests and then poke them with a stick for fun?

    Just wondering.

  37. Nick,
    It does not matter which graph it is, whether the caption does or does not describe the splice, or whether the line colors show it. The question is, why use a splice at all? Why not just stop the graph at the end of the tree ring period?

    Well, I will let Richard Alley tell us (from FOIA2011 email 2999.txt)

    ” Taking the recent instrumental record and the tree-ring record and joining them yields a dramatic picture, with rather high confidence that recent times are anomalously warm. Taking strictly the tree-ring record and omitting the instrumental record yields a less-dramatic picture and a lower confidence that the recent temperatures are anomalous. ”

    That is why it is done – to be dramatic.

  38. Craig Loehle, I think the use of instrumental in red is itself a trick, to hide the proxies that end earlier underneath a hockey stick temperature chart.

  39. Kan
    “Why not just stop the graph at the end of the tree ring period?”

    Because it’s exactly what people want to know. Two measures of temperature have been created that cover different but overlapping periods. What’s the total picture?

    There is hardly anything unusual about comparing two measures on the same plot.

    And Alley’s statement is just obvious. Tree-ring proxies for the most part don’t extend into the period of greatest warming, and sometimes don’t reflect it well when they do. So they give a less impressive demonstration, on their own. But the thermometer record exists, and is indeed more reliable.

    That’s why it’s done. It really happened. Why try to air-brush it away?

  40. Kan, combined with the 2009 email by Mick Kelly 122502612
    the possibility that we might be going through a longer – 10 year – period of relatively stable temperatures …
    Speculation, but if I see this as a possibility then others might also.
    Anyway, I’ll maybe cut the last few points off the filtered curve before I
    give the talk again as that’s trending down

    So if temperatures are going up, include them for a more dramatic effect. If they are going lower, exclude them.

  41. In this thread on CA, http://climateaudit.org/2011/11/25/behind-closed-doors-perpetuating-rubbish this phrase from Mann is quite telling, imo. It backs up his seeming incomprehension that an upside-down series could matter.

    so if a series is truly crap in an objectively determined sense, it got very low weight.

    Does this really show that he has so much trust in the statistics as to imagine that any garbage series that correlates well with temperature makes a good proxy? I can kind of see that he is right in some cases; for example, in a Fourier transform any sine wave which correlates at all is part of the proxy. Similarly, a straight-line fit on a scatter plot might often have some predictive value – and that sort of thing gets a fairly strong presence in undergrad physics courses (and near enough zero on significance).

    Particularly going back 15 years, when this use of stats in climatology was a bit more cutting edge, I can imagine that it could be a justification for pulling in any series with the slightest hint of physical justification – in the expectation that the magic of RegEM would sort out the mess. This flawed starting point becomes the foundation of confirmation bias, some (others) start feeding in money for their own personal ends, and the process starts to snowball – without necessarily any malicious intent on the science side.

  42. Tree-ring proxies for the most part don’t extend into the period of greatest warming,

    This is totally untrue. Unfortunately Nick’s defend-at-all-costs results in him making statements without any foundation.

  43. I had a boss who was the head of a laboratory in a large iron foundry back in the 1970s.

    There was one particular wet analysis test that had to be undertaken on a particular substance in the finished cast iron. It was notoriously hard to do and my boss had a special correction factor of – I think – 2. If the result was too low, she would add 2. If it was too high, she would take 2 off.

    The only difference between my then boss and the global warming scientists was that she had no qualifications in science. In all other respects, she seemed very similar to them!

  44. MikeN–yes, thick red line for instrumental is also a trick
    I am very curious what Nick thinks the purpose of the spaghetti graphs is (and yes, in the IPCC graphs Nick, there is also monkey business, like only including graphs they like and end padding/reflecting): it is not to show recent trends but to try to convince that the proxies are adequate to reconstruct temps back 1000 yrs or more. Proxies that diverge don’t meet this criterion.

  45. Nick Stokes:

    Tree-ring proxies for the most part don’t extend into the period of greatest warming,

    Steve McIntyre:

    This is totally untrue. Unfortunately Nick’s defend-at-all-costs results in him making statements without any foundation.

    To demonstrate how totally untrue this is, here is a link to the data for the Mann 08 paper.

    There are 920 tree-ring series in this data set (id “9000”).

    Of 920 tree ring proxies, all 920 extended to the year 1998.

  46. BTW, there are 7 “series 9000” proxies listed in the meta data that I didn’t find a corresponding file for. These are:

    arge087.ppd
    arge088.ppd
    arge089.ppd
    arge093.ppd
    norw7.ppd
    russ191.ppd

    I find 927 tree ring proxies in the 1209proxynames file and 920 corresponding tree-ring data files.

    Anybody who wants to verify this can do so by downloading these files:

    http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/data/proxy/allproxy1209/Readme.txt
    http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/data/proxy/allproxy1209/A_allproxy1209.zip
    http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/1209proxynames.xls

    and unzip A_allproxy1209.zip.

    SteveM also has these converted to R-language format for people who use that. They are located in

    http://www.climateaudit.info/data/mann.2008
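
    If anyone wants to reproduce the end-date count, here is a rough Python sketch. It assumes the unzipped allproxy1209 files are plain whitespace-delimited text with the year in the first column (that layout and the directory name are my assumptions; check the Readme), and it does not filter to the “9000” tree-ring ids, which needs the metadata spreadsheet.

    import glob
    import numpy as np

    last_year = {}
    for fname in glob.glob("allproxy1209/*"):   # adjust to wherever the zip was extracted
        try:
            data = np.loadtxt(fname, ndmin=2)   # year assumed to be in column 0
        except (ValueError, OSError):
            continue                            # skip Readme.txt / anything non-numeric
        if data.size:
            last_year[fname] = int(data[:, 0].max())

    n1998 = sum(1 for y in last_year.values() if y >= 1998)
    print(n1998, "of", len(last_year), "series reach 1998")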

  47. Carrick,

    Your finding of gonzo data in Mann08’s documentation doesn’t surprise me. Beyond what Steve McI has reported for Tiljander-in-Mann08 — contaminated, uncalibratable, upside-down — here are a few more:

    * There are two Tiljander data series, and two other series calculated from them. Mann08 didn’t recognize this, and treated them as four independent series.

    * The final varve that Tiljander03 reported on was dated 1985. To cover the full calibration/validation period, Mann08 used RegEM to infill values through 1996. “Infill” is fancy talk for extrapolate. Thus, a considerable portion of the “late” calibration and validation periods rely on made-up numbers.

    * The varve assigned to the year 1326 was anomalously thick, an order of magnitude more so than were the varves for immediately preceding and subsequent years. Mann08 replaced the Tiljander03 values for 1326 with (1325 + 1327)/2. This alteration is undocumented, as far as I know.

  48. Thanks AMac. I don’t find 7 missing series to be a big problem, at least for what I want to use it for.

    It’s certainly a much smaller mistake than erroneously claiming “tree-ring proxies for the most part don’t extend into the period of greatest warming”.

    Truthfully, compared to how the Brits treat data as state secrets, Mann’s release of his data is like letting me sleep in the spare bedroom, giving me a key to the house, and free range of his house including kitchen and swimming pool.

    This by the way:

    * The varve assigned to the year 1326 was anomalously thick, an order of magnitude more so than were the varves for immediately preceding and subsequent years. Mann08 replaced the Tiljander03 values for 1326 with (1325 + 1327)/2. This alteration is undocumented, as far as I know.

    has been given the moniker “extirpolation”. I’m sure it was done manually in this case, but you can set up an automated criterion for rejecting “spike noise” and replacing it with neighboring points.

    They should have mentioned they were employing the technique here, but I don’t see anything wrong about using the technique itself.

    Random aside:

    Extirpolation is a curious example, by the way, of where a higher sampling rate is better, even if nominally you are already comfortably inside of the Nyquist limit at your current sampling rate. [This curiosity is explained by noting that the usual information theory arguments relating to the sampling theorem are associated with linear operations, whereas the “threshold rejection method” which extirpolation uses is inherently nonlinear.]

  49. Mann 2008 also cut all of the proxies off at 1998, which does surprise me. Why not extend the data to its existing limits?

  50. Thanks for that on extirpolation. I don’t like it — seems the downstream problems come from averaging, which I am not convinced is the best way to look at these data.

    For fun, I looked at Tiljander, taking the median value for each 20-year period. The reasoning is that rare, unusual events (eg downpours, hurricanes) might cause a surge of sediment into the lake, above a backdrop of slow, steady accumulation. When looked at this way, Lightsum and Darksum look quite similar to each other, and much smoother than is the case with a rolling average. But, as with the data as presented in Tiljander03, there’s no obvious telltale that says “this is a proxy for temperature.” I suspect they aren’t — that they say more about precipitation than anything else.

    Re: 1998, I think Mann08’s authors wanted to perform the same procedures on all proxies, thus uniform screening, calibration, and validation periods. 1998 might have been an arbitrary “recent, but not too recent” endpoint, such that tricks such as RegEM infilling could be minimized.
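
    The 20-year block median is simple to reproduce, for anyone curious. A minimal Python sketch on synthetic varve-like data (this is not the actual Tiljander series; the surge rate and sizes are made up):

    import numpy as np

    rng = np.random.default_rng(2)
    thickness = rng.lognormal(mean=0.0, sigma=0.3, size=600)   # synthetic varve thicknesses
    thickness[rng.random(thickness.size) < 0.02] *= 10         # rare surges (downpours, hurricanes)

    # 20-year block medians: robust to the occasional surge
    block_medians = [np.median(thickness[i:i + 20]) for i in range(0, thickness.size, 20)]

    # 20-year rolling mean for comparison: dragged upward by the same surges
    rolling_mean = np.convolve(thickness, np.ones(20) / 20, mode="valid")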

  51. Amac, extirpolation and similar methods (e.g., median filtering) are very useful when you have “sudden events” that are not representative of the population being studied. It is a completely legitimate methodology when used properly.

    There really are two phases to it:

    1) rejection of noisy data (e.g., threshold rejection, where all data points that are above a preset level are rejected).
    2) replacement of missing points, by interpolated value of surrounding points (I use a cubic interpolation rather than a simple average, but the idea is the same).

    Part #2 is usually only done when your algorithm needs the data to be equally spaced. Part #1 is justified by looking at the envelope of the time series:

    [linked figure: “LH” envelope]

    And then plotting the histogram of this:

    [linked figure: “LH” envelope histogram]

    In this case, values above about 6000 on this arbitrary scale obviously don’t belong to the population of values you are interested in measuring, and including those samples (or windows of data in this case) is just introducing noise to your analysis, with no benefit.
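
    To make the two phases concrete, here is a bare-bones Python sketch of that sort of procedure. The function name, threshold, and series are hypothetical, and it uses linear rather than cubic interpolation for the replacement step.

    import numpy as np

    def extirpolate(values, threshold):
        """Phase 1: reject points above `threshold`. Phase 2: replace the rejected
        points by interpolating from the surviving neighbours."""
        vals = np.asarray(values, dtype=float)
        idx = np.arange(vals.size)
        keep = vals <= threshold
        out = vals.copy()
        out[~keep] = np.interp(idx[~keep], idx[keep], vals[keep])
        return out

    varves = np.array([1.1, 0.9, 1.3, 19.0, 1.2, 1.0, 7.1, 1.1])   # one large spike, one smaller
    print(extirpolate(varves, threshold=10.0))   # rejects only the 19.0
    print(extirpolate(varves, threshold=5.0))    # rejects 19.0 and 7.1 -- the cutpoint matters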

  52. I should also mention it can even be worse than introducing noise—there can be a nonlinear interaction between the quantity you are trying to measure and this undesirable noise. So not only does it corrupt your signal by increasing noise, it can generate artifacts that corrupt the dataset itself, if left in the analyzed portion of the data.

    Since this whole process can be done in an automated fashion (I use it for cleaning up spectral periodograms), and as long as there is an objective basis for picking the threshold for rejection, I don’t see any problem with that method.

  53. Relating to this comment, it turns out that all 927 data series are present. The id’s in the meta files were wrong.

    Here’s a comparison (left is the id, right is the file name root):

    arge087 arg87ars
    arge088 arg88ars
    arge089 arg89ars
    arge093 arg93ars
    norw7 norw7ars
    russ191 russ191ars
    (missing) tornetrask

    I found this out when I wrote a program that used the data files directly (which contain the type ids for the proxies as well as the lat & long), instead of relying on the metadata.

  54. Re: Carrick (Nov 27 22:39),
    “To demonstrate how totally untrue this is, here is a link to the data for Mann 08 paper.”

    You have given a link to Mann 2008. But this thread is discussing a 2000 email. In Briffa’s list for his 2001 paper, out of 389 sites, 89 got no further than 1978. In total, 246 reached 1988. Only 143 extended into the ’90s.

  55. Jeff, re Hide the Decline, have you read 1645.txt, sent just 5 hours before the famous email:

    date: Tue Nov 16 08:57:47 1999
    from: Tim Osborn
    subject: time series for WMO diagram
    to: p jones

    The age-banded density Briffa et al. series can be got from:
    /cru/u2/f055/tree6/NHtemp_agebandbriffa.dat
    It is ready calibrated in deg C wrt. 1961-90, against the average Apr-Sep land temperature north of 20N. It goes from 1402 to 1994 – but you really ought to replace the values from 1961 onwards with observed temperatures due to the decline.

  56. Carrick #63 / #64 —

    Thanks for the background. However, after thinking about it, I’m even less impressed with Mann08’s procedures.

    I think it has to do with the meaning of the terms “signal” and “noise”.

    extirpolation and similar methods (e.g., median filtering) are very useful when you have “sudden events” that are not representative of the population being studied.

    Let’s look at methods for improving digital mobile phone call quality. In this case, “signal” means “the audio signal received by the sending phone’s mic.” Or more specifically, “those sounds that contribute to increased comprehension by the listener of the words spoken by the sender.” Everything else can be considered to be “noise.”

    In the case of the Lake Korttajarvi varves, we should start by recognizing that just about everything is “signal”. That mineral or organic sediment is present as the result of physical processes in the lake’s watershed. A priori, we don’t know what the equivalent of “audio input” or “spoken-word comprehension” might be, if there is one. Varve sediment composition and thickness could be “signals” of: precipitation, precipitation bursts (mobilizing soil), spring meltwater (same), recent forest fire history (more bare ground), plant cover (certain species will contribute more organic fines to the lake), algal growth in the lake (organic fines), lake level, peat cutting, farming, plowing techniques, road building, lakeside bridge reconstruction.

    Or temperature.

    Any or all of these processes might contribute to the “signal” that is the varve.

    Since we don’t know what the characters of the “signals” are, how can we process the data series (e.g. by extirpolation) to enhance that signal?

    1) rejection of noisy data (e.g., threshold rejection, where all data points that are above a preset level are rejected).

    As above, if “noise” and “signal” aren’t correctly defined, how can we set a threshold?

    Graph of Tiljander03 Thickness at BitBucket (Excel file)

    Mann08 chose to reject the 1326 value of ~19 mm. But they chose to accept two 20th Century values of ~7 mm. Why? What was the procedure that set the cutpoint somewhere between these two numbers? Was it ad hoc, or pre-specified? Since it goes unmentioned in the Methods or the SI (AFAICT), there is no way to know.

    2) replacement of missing points, by interpolated value of surrounding points

    In the case of Tiljander, spacing is annual throughout, so that’s not an issue for 1326.

    What I see is more of the same. Arrogance. Convenient assumptions. A low priority placed on data integrity. Sloppiness. Incomplete and even misleading descriptions of methods, waved through peer review.

    AMac

  57. Carrick wrote: “Of 920 tree ring proxies, all 920 extended to the year 1998.”

    Are you sure? I looked in the Mann 1209proxynames excel file which seems to suggest very few extend to 1998. Have they been padded?

  58. #69. Nearly all were padded, except for a few which didn’t require it. The process used was RegEM, which applied a linear weight to the non-missing values based on their covariance with the series which had missing values. If you look at the decimal places of the raw-ish data, you can see the infilling.
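
    For anyone who wants to see the decimal-place fingerprint, a rough Python sketch. The assumed reporting precision (3 decimals) is only an example, not a documented property of the archived files; the point is that measured values sit at a fixed precision while infilled values carry full floating-point precision.

    import numpy as np

    def looks_infilled(values, decimals=3):
        """Flag values that are not exact at the assumed reporting precision."""
        vals = np.asarray(values, dtype=float)
        return ~np.isclose(vals, np.round(vals, decimals), atol=1e-12)

    series = np.array([0.123, 0.456, 0.2837461928, 0.789, 0.5119274301])
    print(looks_infilled(series))   # flags the two long-decimal (likely infilled) values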

  59. Let me just tell you kids, don’t try this trick to hide the decline on your high school science paper.

    Heh. You need a doctorate to permit Piling that High and Deep. Even better if you have tenure. Best of all would be membership in the Noble Order of Government Gatekeepers and Grant Givers (NOoGGaGG).
