the Air Vent

Because the world needs another opinion

These are not the thermometers you’re looking for

Posted by Jeff Id on January 28, 2010

Guest post by Gavin Schmidt.

“The idea that we’re fraudulently cutting out stations is appallingly defamatory and ignorant,” he said. “These people are desperate to come up with some hint of impropriety. The allegations are absolutely without foundation and based on profound ignorance.”

Schmidt also said a smaller sampling of weather stations in the Canadian Arctic wouldn’t have a significant impact on the data. He said any long-term temperature changes recorded at the high Arctic station at Eureka, would likely be “representative” of changes elsewhere in the region, even in a sub-Arctic city like Yellowknife.

I didn’t make the claims, but I know some people who did. They tend to be pretty careful about looking at data. If Gavin’s right, why not just keep all the data in the record? –Jeff Id

132 Responses to “These are not the thermometers you’re looking for”

  1. Jeff Id said

    I am profoundly and completely ignorant of any possible rationale that all the data should NOT be aggressively sought and openly collected. Maybe someone can explain that to me.

  2. JimD said

    Given Schmidt’s version of reality, of course, he’s absolutely right – why bother with real data when it only makes more work homogenising/gridding it into what your models require it to say anyway?

    You could always spend the money you’d otherwise waste on rubbish like thermometers and data recorders (perish the thought!) on something more worthwhile.

    Any suggestions?

  3. JAE said

    “He said any long-term temperature changes recorded at the high Arctic station at Eureka, would likely be “representative” of changes elsewhere in the region, even in a sub-Arctic city like Yellowknife.”

    Hmmm. Then, by that logic, why don’t we just use one station, maybe in Los Angeles, to be “representative” of the whole world?

  4. Peter of Sydney said

    Better still, why not one at the North Pole and one at the South Pole and average the two?

    Seriously though, I don’t mind the reduction in number. It’s the locations of the ones that are used that concerns me. I still think it’s far better and more scientific to exclude all those at or near populated areas (eg, buildings, airports, etc.).

  5. dearieme said

    “would likely be “representative” of changes elsewhere in the region…”: only “likely”? That doesn’t sound like settled science.

  6. Harry said

    If they have nothing to hide then let them show everything (raw data, code) so we can check and verify for ourselves. Why trust them? Have they earned our trust? I guess not. The blogs by ChiefIO and Anthony indicate that something nasty is going on, which needs to be investigated. Thoroughly.

  7. Ian said

    The question that still vexes me is:

    Why are they cutting records out? What principles are applied and processes used to determine that records “X” and “Y” are to be cut, but “Z” retained?

    Are the principles enunciated anywhere? Are the processes defined and set out? Is it possible to look at record “X”, and see the notation that says: on the basis of the following principle(s), this record has been removed because [enter explanation here]. One also would expect then to see an estimate of the impact the removal of this record would have and how they have adjusted for it. That should be available on an easily accessed, country by country basis.

    Does that explanation exist anywhere? I’m confessing profound ignorance on why the number of records being utilized has dropped off the cliff – but Gavin’s little quote does nothing to dispel that. He provides no explanation at all (at least, not in what Jeff has quoted above: is there any more to what he said besides more of the usual wagon circling? hmm…with what’s been going on lately, it makes you wonder: can you circle a single wagon?)

    Sigh…

  8. Don said

    ‘appallingly defamatory’…so the next logical step would be legal recourse to repair the harm done to your good name (and deeds) Mr. Schmidt?? I dare you to put your reputation and work under the microscope.

  9. “I didn’t make the claims, but I know some people who did. They tend to be pretty careful about looking at data. “
    I’m not convinced of that. From the d’Aleo/Watts report:
    “It can be shown that they systematically and purposefully, country by country, removed higher-latitude, higher-altitude and rural locations, all of which had a tendency to be cooler. “
    The reality, as I found when I averaged temperatures by year over the stations in the GHCN data set v2.mean, is that that average is declining. This “purposeful removal” of cool stations produces a cooler average.

  10. Eric said

    Prof. Schmidt seems slow to realize that in the post-Climategate world there is very little trust across the lines of debate. There is no longer an assumption of good faith. That is CRU and IPCC’s one indisputable achievement.

  11. Kenneth Fritsch said

    Why don’t we stop all the speculation, accusations, counter-accusations and assorted other BS and encourage a study with a real analysis of these effects. And quite frankly I do not give a damn what Gavin Schmidt says unless he backs it up with a real analysis.

  12. CoRev said

    Nick Stokes said: “The reality, as I found when I averaged temperatures by year over the stations in the GHCN data set v2.mean, is that that average is declining. This “purposeful removal” of cool stations produces a cooler average.” Compared to what?

    It is supposed to be cooling, so have you just confirmed that cooling? I also thought the V2 version was already homogenized, but I could be wrong there.

  13. Tom Fuller said

    It is my understanding, based on a Gavin Schmidt comment on Real Climate, that the 6,500 figure for measuring stations is somewhat artificial, in the sense that they did not report in real time. A major research project went to many of these stations, copied data off of paper records, and entered it into the database. Because the project was a one-time event, it isn’t realistic to think of those stations as part of an ongoing measurement network.

    That said, I agree with most of what’s written in the comments here. I think you probably could do a good job with 1,500 perfectly sited and perfectly functioning stations. Is that what we have? It doesn’t really seem like it.

  14. AMac said

    Here’s a link to the Gavin Schmidt quote. Fewer temperature reports could mean warming underestimated, scientist says. By Richard Foot, Canwest News Service, January 23, 2010.

    This interesting bit is in the article, too–

    However, D’Aleo and Smith’s claims were dismissed in a report published online Thursday by The Yale Forum on Climate Change and the Media. It said the sampling of weather stations around the world has declined in recent years, not because fewer stations are being included in the temperature records, but because of time delays in collecting data from individual stations. Station sampling from the 1970s appears high in number today, only because “those records were collected years and decades later through painstaking work by researchers. It’s quite likely that, a decade or two from now, the number of stations available for the 1990s and 2000s will exceed the (number) reached in the 1970s.”

    Also, Nick Stokes deserves kudos for pulling data, analyzing it, graphing it, and posting his code, on his blog. Linked in #9, supra.

  15. Mike D. said

    Given the tenor of Schmidt’s frequent abusive insults towards commenters on his blog, “appalling and defamatory” are par for the course. $Billions have been spent propping up the AGW hoax, and $trillions more in exactions desired. We the victims of this fraud have every right to be vociferous in our condemnations. If you can’t stand the heat, stay out of the Federal Treasury.

  16. #12 Corev
    No, the narrative is that warmer stations are being deliberately selected, or cooler stations eliminated. “As the Thermometers March South, We Find Warmth”. Which is claimed to lead to warming, although with anomalies it shouldn’t. Anyway, the shift is to more cooler stations, not less. Enough to exceed the AGW effect.

  17. Don said

    Nick,

    I believe it is Jeff saying ‘didn’t make the claims’ – not Gavin

  18. Atomic Hairdryer said

    I still don’t understand the scientific objection to supplying raw data. Master chefs like Schmidt can cook it how they want to create their signature dishes, but I still want to understand what it is I’m eating, and how their cooking will affect my bill. If I don’t like what they’ve cooked, I won’t leave a tip.

  19. Harry said

    Sorry Dr Schmidt, in the world of public opinion, guilt by association is real.
    Anyone who had close dealings with the East Anglia people should resign for the good of the science.
    Just as anyone who was in any way associated with the “Watergate Coverup” ended up resigning as well.

    It may or may not be fair…but when trillions of dollars of public and private money is ‘riding on the science’, anyone without absolutely impeccable integrity just ends up muddying the water. The entire ‘Hockey Team’ needs to look into a career involving french fry machines and funny hats.

  20. Kenneth Fritsch said

    Station sampling from the 1970s appears high in number today, only because “those records were collected years and decades later through painstaking work by researchers. It’s quite likely that, a decade or two from now, the number of stations available for the 1990s and 2000s will exceed the (number) reached in the 1970s.”

    On Friday Gavin Schmidt, a senior climatologist at the Goddard Institute for Space Studies, said NOAA and NASA don’t choose which stations to include in their database – that’s decided by national bodies like Environment Canada which feed information to the World Meteorological Organization.

    Why would it take several painstaking years to provide raw data to the data set owners and for them then to apply their algorithms to adjust it? Most of the process is automated through use of existing algorithms and quality checks. The comment above would have one believe that the data has to be subjected to some group of people who slave over the numbers and then make subjective judgements on the data. That is not at all the description of the processes that I have read.

    Why would some stations’ data be reported in a timely manner and others’ not? If all stations did these supposed painstaking efforts before processing the data, I would guess that we would have a several-year delay before we had any temperatures. I absolutely do not follow the reasons given here.

    I have major doubts that Gavin or any others who are semi-responsible to responsible people for these data sets could cherry-pick the stations, primarily because of their non-involvement with the station collection processes. I personally think that they are pretty much unaware of the process as it actually operates day to day and in the field, and rather spend much of their data set efforts on defending the numbers and process. But that is also why I take Schmidt’s comments and arm waving about the process with a grain of salt.

  21. Craigo said

    7.Ian said
    January 28, 2010 at 7:05 pm
    this record has been removed because [enter explanation here].

    Suggestions for the drop down box:
    a. it has a cooling trend
    b. it doesn’t matter
    c. it is statistically insignificant after homogenization
    d. it does not support my latest grant application
    e. you are not a peerreviewedpublished climatescientist so you wouldn’t understand
    f. its jolly complicated and i haven’t got time to explain because my plane leaves for Aruba where I have a really really important thing to do
    g. you are not a Team Member so you couldn’t possibly understand the sophisticated “tricks” used for selection
    h. we don’t understand where the warming has gone so have omitted this station
    i. i don’t read your blogs so didn’t see the question
    j. i don’t read your blogs but will debunk your hypothesis at RC shortly (and send in the trolls)

    What have I missed?

    Or we could resort to some old time favourites:
    k. “I hear nothing, I see nothing, I know nothing” (Sgt Schultz)
    l. “I am not the one – the one who is the one is not here” African employee response

  22. boballab said

    #20

    And you’re right, Kenneth, especially when you go to the WMO and look up how this is all supposed to work and what the WMO did starting in 1985 to automate it:

    A major step forward in climate database management occurred with the World Climate Data and Monitoring Programme (WCDMP) Climate Computing (CLICOM) project in 1985. This project led to the installation of climate database software on personal computers, thus providing NMHS in even the smallest of countries with the capability of efficiently managing their climate records. The project also provided the foundation for demonstrable improvements in climate services, applications, and research. In the late 1990s, the WCDMP initiated a CDMS project to take advantage of the latest technologies to meet the varied and growing data management needs of WMO Members. Aside from advances in database technologies such as relational databases, query languages, and links with Geographical Information Systems, more efficient data capture was made possible with the increase in AWS, electronic field books, the Internet, and other advances in technology.

    http://www.wmo.int/pages/prog/wcp/documents/Guide2.pdf

  23. Kenneth #20
    “Why would it take several painstaking years to provide raw data to the data set owners and for them then to apply their algorithms to adjust it? Most of the process is automated through use of existing algorithms and quality checks.”

    Gavin is referring to the original GHCN project, of which v1 came out in 1992. I’m sure there was a huge amount of painstaking work. I was myself peripherally involved in an Australian exercise of this kind in about 1979.

    You’ll note that the v2.mean file is about 43Mb. That’s monthly averaged. It’s likely that for at least some of the data, GHCN would have handled daily records. And handling often means actually typing from hand-written or printed records. Typing of course then means checking.

    Then there’s metadata. GHCN decided not to use it for adjustment, but I bet they collected a lot. There would have been low-level stuff – thermometer type, calibration etc. Plenty of work there.

  24. Carrick said

    Nick Stokes:

    The reality, as I found when I averaged temperatures by year over the stations in the GHCN data set v2.mean, is that that average is declining. This “purposeful removal” of cool stations produces a cooler average.

    You need to be careful about how you average, especially since you aren’t averaging anomalies.

    A small shift in average latitude of the stations over time can swamp all other effects.
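
    To make that concrete, here is a minimal sketch with made-up numbers: two warm and two cold synthetic stations, all given the same assumed trend, with the cold pair dropping out after 1990. The absolute average jumps when the mix changes; the anomaly average barely notices.

      # Sketch only: synthetic data, not GHCN. Shows why a shift in the station
      # mix biases an average of absolute temperatures but not an average of
      # anomalies taken against each station's own base period.
      import numpy as np

      years = np.arange(1950, 2011)
      trend = 0.01 * (years - 1950)                      # assume 0.01 C/yr everywhere
      base = {"tropical_1": 26.0, "tropical_2": 24.0,    # hypothetical climatologies
              "arctic_1": -12.0, "arctic_2": -15.0}
      temps = {name: b + trend for name, b in base.items()}

      # the two cold stations stop reporting after 1990
      reporting = {name: (years <= 1990) if name.startswith("arctic")
                   else np.ones(years.size, dtype=bool) for name in base}

      def mean_over_reporting(series):
          return np.array([np.mean([series[n][i] for n in base if reporting[n][i]])
                           for i in range(years.size)])

      abs_mean = mean_over_reporting(temps)

      # anomalies: subtract each station's own 1961-1990 mean first
      in_base = (years >= 1961) & (years <= 1990)
      anoms = {n: t - t[in_base].mean() for n, t in temps.items()}
      anom_mean = mean_over_reporting(anoms)

      i1990, i1991 = 40, 41
      print("jump in absolute mean at dropout: %.2f C" % (abs_mean[i1991] - abs_mean[i1990]))
      print("jump in anomaly mean at dropout:  %.2f C" % (anom_mean[i1991] - anom_mean[i1990]))

    With these invented numbers the absolute average jumps by roughly 19 C at the dropout year, while the anomaly average moves by only the 0.01 C of trend.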

  25. Genghis said

    Nick Stokes

    Can you show me the examples where they have compared the grid homogenization with actual data from the grid area that wasn’t provided prior to the homogenization?

    It seems to me that this would be the very first test that I would do to verify the accuracy of the homogenization algorithm.

  26. timetochooseagain said

    “why not just keep all the data in the record?”

    Quite literally their excuse is “too much work”. See Nick at 23.

    Frankly, I don’t care how much work it is. IT SHOULD GET DONE! We pay these people quite a bit you know.

    And why is it that every time John Christy takes the time to collect data and produce a long term data set, he not only finds more stations than the official groups use BUT THE WARMING IS ALWAYS LESS? This is in East Africa, Northern Alabama, Central California – he takes the time and effort (JUST HIM mostly!) and in months collects more data and analyzes it (sometimes data which hasn’t even been digitized) than these people do with huge teams paid federal money in YEARS!

    Read the papers anyone?

  27. Carrick #24
    “A small shift in average latitude of the stations over time can swamp all other effects.”

    Yes, but for this exercise that’s the effect we’re looking for.

    Likewise #25 Genghis it’s not checking the accuracy of homogenisation. Just whether it’s really true that there is a shift in stations to warmer circumstances, as WUWT claims.

  28. Gavin: Station sampling from the 1970s appears high in number today, only because “those records were collected years and decades later through painstaking work by researchers. It’s quite likely that, a decade or two from now, the number of stations available for the 1990s and 2000s will exceed the (number) reached in the 1970s.”

    Sounds like a clear confession that what was learned about warming in the 1970s IS NOT KNOWN ABOUT THE 1990s AND 2000s yet, because “the painstaking work” hasn’t been done for those later years.

    “Man, we learned all about the 70s from doing intensive labor that would have killed ordinary men…but we don’t really know shit about the 90s and 2000s because we haven’t done that backbreaking work. Ergo, you can discount whatever we’ve said about the 90s and 2000s so far. It’s just hype and speculation.”

  29. timetochooseagain said

    27-“as WUWT claims”

    Actually, the claim originates with EM Smith but is really annoyingly, stupidly clueless because the shift of stations to lower latitudes does NOT spread the (absolute) warmth of the tropical stations northward – it just doesn’t work that way. There ARE sampling problems and the data DO clearly exaggerate warming IMAO, but that claim in particular is just shit.

  30. John Norris said

    “These people are desperate to come up with some hint of impropriety …”

    No desperation required. Plenty of impropriety is now visible; well beyond hints. Gavin’s emails have been very tame, but guilt by association looms large.

  31. POUNCER said

    #29 “the claim originates with EM Smith but is really annoyingly, stupidly clueless because the shift of stations to lower latitudes does NOT spread the (absolute) warmth of the tropical stations northward-it just down’t work that way. ”

    How does it work?

    Smith may be like a magician showing how the girl is sawed in half, while you and Gavin Schmidt are abdominal surgeons. But Smith details the smoke and mirrors hypothesis showing how big a mirror is, how it is tilted, how much gunpowder is in the smoke pot … showing data sources, posting FORTRAN code, and discussing the logic and the implementing algorithms. At the end of his presentation, ANYBODY can duplicate the feat of “sawing the lady in half.”

    The professional climatologists, instead, say: “It doesn’t work that way. Trust us, we’re doctors.”

    Now Smith may be wrong, and Schmidt may be right. But Smith is doing a much better job of persuading me of his honest intent.

  32. stumpy said

    I think you hit the nail right on the head: many of these stations are still there reporting; they have just been dropped for some reason – why????

    There’s no logic to it unless you simply don’t like the data they are recording!

  33. Genghis said

    Nick

    No, the question isn’t the shift of stations. The question is the testing of the model’s homogenization technique. Are there tests of grid cells where the homogenized projections are compared to actual data? In other words, have they left thermometer records out and then filled in the cell using the homogenization technique and compared it to the actual record?

    It seems to me that this would be the minimal required test.

  34. #9 A retraction is needed on my claim that average temperatures of stations in the GHCN set are declining. I discovered a programming error – a memory overflow. In the revised result, the stations are warming over the period since 1950.

    However, as TTCA says (#29), the fact that the selection of sites is getting warmer does not mean that the anomalies will increase.

  35. kuhnkat said

    Nick Stokes,

    as usual you belabor the point of shift to warmer stations. I believe the claim is that MORE stations are included in the baseline period making it cooler than the current so that there is a larger anomaly.

    The real dirt is in the homogenisation and adjustments, loss of true rural stations with little trend, and the gridding with associated smearing of higher trends from warmer stations not in the same general hemisphere.

    By the way, GISS apparently carries the temps through to the final computations, so, we AREN’T saved by the anomaly!!!!

    But then, you knew all of this. Thanks for the apologetics as usual.

  36. kuhnkat #35
    By the way, GISS apparently carries the temps through to the final computations…

    It carries temp through to to.SBBXgrid.f in step 3. At that stage while mapping the data onto grid points it forms the anomalies. So the anomaly is not local to the station, but to the grid point. In doing this it takes data from up to 1200 km away, but with a tapered weight function.
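
    For anyone curious, a rough sketch of that kind of tapered weighting follows. It is a simplified stand-in, not the actual to.SBBXgrid.f logic; the 1200 km cutoff and the linear taper come from Nick’s description, and the station numbers are invented.

      # Sketch only: linear taper from 1 at the grid point to 0 at an assumed
      # 1200 km cutoff. Not the GISTemp code itself.
      import numpy as np

      def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
          p1, p2 = np.radians(lat1), np.radians(lat2)
          dlon = np.radians(lon2 - lon1)
          cosang = np.sin(p1) * np.sin(p2) + np.cos(p1) * np.cos(p2) * np.cos(dlon)
          return radius_km * np.arccos(np.clip(cosang, -1.0, 1.0))

      def gridpoint_anomaly(grid_lat, grid_lon, stations, rcrit_km=1200.0):
          """Weighted mean of station anomalies; weight = 1 - d/rcrit, zero beyond rcrit."""
          wsum, vsum = 0.0, 0.0
          for lat, lon, anom in stations:
              d = great_circle_km(grid_lat, grid_lon, lat, lon)
              if d < rcrit_km:
                  w = 1.0 - d / rcrit_km
                  wsum += w
                  vsum += w * anom
          return vsum / wsum if wsum > 0 else np.nan

      # hypothetical stations: (lat, lon, anomaly in C)
      stations = [(45.0, -90.0, 0.6), (48.0, -85.0, 0.4), (40.0, -100.0, 0.9)]
      print(gridpoint_anomaly(46.0, -92.0, stations))

    Nearby stations dominate the grid point value and anything beyond the cutoff contributes nothing, which is why the grid anomaly is regional rather than local to any one station.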

  37. Genghis said

    Nick Stokes
    “So the anomaly is not local to the station, but to the grid point. In doing this it takes data from up to 1200 km away, but with a tapered weight function.”

    Has there been any testing to see if the tapered weight function is accurate with real data? It should be trivial to withhold a stations data, process a grid point with the tapered weight function (on the station) and compare it to the actual data.

    Is there any place I can see the results of that test?

  38. Keith W. said

    Nick #36, so the temperature in Atlanta influences the temperature in St. Louis, even though there is a mountain range and other natural barriers between them. This is the problem I have with the 1200 km “fudge factor”. Does the temperature in San Francisco relate to the temperature in Denver? I would much rather see smaller grid cells with additional data points than the supposition that the temperature in Philadelphia is teleconnected to the temperature in Atlanta.

  39. #37 Genghis
    This is just interpolating onto grid points. There’s no independent point you can check against. And remember why grid values are used. It’s to produce the shaded plots that you see, and to compute appropriate area-weighted global and hemisphere averages. It’s an intermediate number – no-one publishes individual grid point values.

    BTW, I see on looking at the code that the interpolation region is variable. There’s a comment on the code which says the range is 1200, but in fact it is RCRIT, which is the resolution chosen. It can be 1200 or 256 km.

    #38 Keith W.
    Again, it’s a grid value that’s being calculated, not a site value. And if you ask for 256km resolution, it’s only gathering up to 256 km away (despite the comment I mentioned above).

    It’s not recomputing Atlanta. In fact, the approximation occurs when the temps are aggregated onto the grid points. The approximation involved in then putting that grid data relative to a common anomaly is minor in comparison.

  40. Jeff Norman said

    Jeff,

    The title is hilarious.

    Jeff

  41. Genghis said

    Nick Stokes

    “This is just interpolating onto grid points. There’s no independent point you can check against.”

    I understand that. But if they left out station data when they were doing the interpolating then they would have an independent point to check against. I would expect the interpolation to be very close to the actual data.

    “And remember why grid values are used. It’s to produce the shaded plots that you see, and to compute appropriate area-weighted global and hemisphere averages.”

    Again, I understand. It is the accuracy that I am curious about.

    “It’s an intermediate number – no-one publishes individual grid point values.”

    Surely they have tested their method for accuracy?

    “BTW, I see on looking at the code that the interpolation region is variable. There’s a comment on the code which says the range is 1200, but in fact it is RCRIT, which is the resolution chosen. It can be 1200 or 256 km.”

    Does the range change the results? I was under the impression that the anomaly was invariant or was I wrong?

  42. timetochooseagain said

    31- I’m not a professional climatologist and I’m not asking you to trust me.

    Okay, here goes: If the official warming data were based on the average of the temperatures observed at all stations, it would look like the figure on page 14 here:

    http://noconsensus.files.wordpress.com/2010/01/surface_temp1.pdf

    Except that it doesn’t look anything like that! You can’t take stations and just average them; climatologists, even the dumb ones, know that will give nonsense results. So they don’t. The first thing they do which eliminates most of the problem arising from such averaging is to subtract the mean for a period, say 1960-1990 or so, from each station. Since most of the latitude effect is that the stations at higher latitudes that disappear are colder in terms of absolute temps, removing their average will eliminate most of the effect of station dropout.
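
    In code, the step described above is just the following, for one invented station series with an assumed 1960-1990 base period:

      # Sketch: convert a station's absolute temperatures to anomalies by
      # removing its own base-period mean. Values are made up.
      import numpy as np

      years = np.arange(1950, 2010)
      temps = -10.0 + 0.02 * (years - 1950)            # a cold station with a small trend
      base_mean = temps[(years >= 1960) & (years <= 1990)].mean()
      anomalies = temps - base_mean                    # now on the same footing as a warm station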

  43. Carrick said

    Nick Stokes:

    Yes, but for this exercise that’s the effect we’re looking for

    If you grid the data the way GISTemp does, you wouldn’t see an effect from the shift in mean latitude. I should have spelled that out, I knew you were aware of that, but you’ve progressed to talking about how GISTemp handles these issues now, so….

    TTCA:

    Since most of the latitude effect is that the stations at higher latitudes that disappear are colder in terms of absolute temps, removing their average will eliminate most of the effect of station dropout

    And if you go through the GISTemp algorithms, they spend a lot of effort homogenizing the data, and dealing with exactly these sorts of problems.

    Watts and D’Aleo barely touched the surface of the types of corrections that get made, and never addressed the question of the degree to which these algorithms fail or succeed in handling some of the issues raised in their 114 page manuscript.

    Beyond that, Watts must be aware of the reason why there are fewer stations available from 1995 to current in GHCN; that information is in the public domain (it has nothing to do with data manipulation, nothing. In fact many or most of the stations are still collecting data.)

    So my question is: do Watts and D’Aleo really charge that GISTemp or GHCN are “fraudulently cutting out stations”? Can somebody point me to where they made that claim? While I generally agree with what Gavin said, he’s not above inventing quotes and strawmen to argue against.

    Finally, I understand Watts is irritated with Menne’s paper, but seriously, publishing a half-assed analysis like this that is full of rancor isn’t much of a return shot. It just makes them look really unhinged.

  44. Carrick said

    #37 Genghis:

    Has there been any testing to see if the tapered weight function is accurate with real data? It should be trivial to withhold a stations data, process a grid point with the tapered weight function (on the station) and compare it to the actual data.

    Is there any place I can see the results of that test?

    You can download the Clear Climate Code reimplementation of GISTemp or GISTemp itself (http://data.giss.nasa.gov/gistemp). Maybe when people get tired of just bitching about the warts that are in the data, they’ll get serious about examining the techniques used in correcting them.

    I’m not saying by any means i think the corrections are perfect, but it would be nice to see full global mean temperature reconstructions performed showing the before/after effects of relaxing or adjusting different corrections. That’s the only way you’ll ever see how much any of these actually matter.

  45. timetochooseagain said

    43-Anthony’s response to Menne et al itself was much more reasonable, I think.

    http://wattsupwiththat.com/2010/01/27/rumours-of-my-death-have-been-greatly-exaggerated/

    Meanwhile, though, we have seen that the IPCC is not open to alternative views, however well supported by extensive evidence:

    http://devoidofnulls.wordpress.com/2010/01/29/not-so-open-to-novel-ideas/

  46. Genghis said

    Carrick

    “I’m not saying by any means i think the corrections are perfect, but it would be nice to see full global mean temperature reconstructions performed showing the before/after effects of relaxing or adjusting different corrections. That’s the only way you’ll ever see how much any of these actually matter.”

    That is my point. How else can they test the accuracy of the corrections? And since they have lots of stations that they are no longer using for data input it seems to me that we can test it ourselves as you suggested.

    If I take the time and effort to test the accuracy of the ‘Corrections’ what is my rejection cut off point? Anything beyond a .1 C anomaly difference?

  47. Kenneth Fritsch said

    Carrick @ Post #43:

    I agree with the essence of what you say in this post and need to add that I get very frustrated with half-assed analyses by both the protagonists and antagonists in this matter. Worse yet is the hand waving that is supposed to pass for “expert” judgments.

    I would think that a site like TAV would do a major service to understanding the problem by setting up a thread to discuss how one would go about determining the statistical uncertainties involved by station coverage and what essential observations, data and calculations are required. I have been working through these issues and looking at individual station data to at least give me a better feel for what is important and some ball park figures for uncertainties.

  48. Carrick said

    Ghenghis:

    If I take the time and effort to test the accuracy of the ‘Corrections’ what is my rejection cut off point? Anything beyond a .1 C anomaly difference?

    It depends a bit on what you are looking at. If you are looking at global mean temperature (GMT), since land is only 28% or so of the total, a 0.1°C error in the land surface record (e.g., GHCN) would amount to only a 0.03°C error in GMT.

    If you want to restrict yourself to less than 0.1°C in GMT, that means you can tolerate an error of 0.3°C in the land surface temperature reconstruction. Notice that since 70% of the total weighting is from oceans, if we are interested in the global record, we need to be spending more time on the ocean temperature record.

    That said, if you are an empiricist as I am, you are still interested in errors in the land record, and simply feel one should get the record as good as one can get it, not just good enough that it doesn’t matter for a given application of the data; in that case, tighter is better.

    For that, I’d say off the cuff that 0.1°C is a good metric for global land temperature. If you restrict yourself to US (smaller land mass equals greater mean fluctuations), I’d use a number like 0.2°C or even higher.
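
    The arithmetic behind those numbers, spelled out (the 28% land weighting and the simple proportional scaling are the assumptions here):

      # Sketch of the error budget above; the land fraction is approximate.
      land_fraction = 0.28
      land_error = 0.1                                   # degC error in the land record
      gmt_error = land_fraction * land_error             # ~0.03 degC contribution to the global mean
      gmt_budget = 0.1
      tolerable_land_error = gmt_budget / land_fraction  # ~0.36 degC, rounded down to ~0.3 above
      print(gmt_error, round(tolerable_land_error, 2))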

  49. Carrick said

    TTCA:

    43-Anthony’s response to Menne et al itself was much more reasonable, I think.

    I agree. Apparently his long paper was cathartic. ;-)

    Kenneth:

    I would think that a site like TAV would do a major service to understanding the problem by setting up a thread to discuss how one would go about determining the statistical uncertainties involved by station coverage and what essential observations, data and calculations are required. I have been working through these issues and looking at individual station data to at least give me a better feel for what is important and some ball park figures for uncertainties.

    Yes I agree, and I think there are two things that should be done here.

    One is to go through GISTemp (or CCC) and run numerical experiments testing the effect of the different adjustments, also to independently vet those adjustments. I don’t think blogs are a great place to try and generate full replications, though I wish Jeff the best in doing that if he has time —having somebody with training in modern signal processing techniques go through and redo what amounts to a witches brew of ad hoc algorithms would be very enlightening.

  50. Kenneth Fritsch said

    Nick Stokes @ Post #23

    You’ll note that the v2.mean file is about 43Mb. That’s monthly averaged. It’s likely that for at least some of the data, GHCN would have handled daily records. And handling often means actually typing from hand-written or printed records. Typing of course then means checking.

    I am at a bit of a loss to determine what point you are making here. The question remains how some data are entered and others not, or delayed in entering. I get from all this that, since an involved person like Gavin Schmidt does not have a definitive answer about the current data situation, he and other data set owners are unaware of the details of these issues with their data sets.

    What has been said is that a one-time effort was made, back at apparently the peak of the station numbers, to obtain what must have been considered decent station coverage. Subsequently these station numbers could not be maintained in the data base, or as an updated data base, not because of a lack of data but because of a lack of data entry. Is this really true, and has an electronic system not been set up to do this?

    What the hell would this say about the seriousness of the mission of the data set owners? What does a lack of definitive and detailed answers about the problem say? I do not suspect any of these involved people of any skulduggery, and that is because I do not suspect anyone without due cause; further, in this case I think they might not have a clue.

  51. Carrick said

    … Oh and the second is to recruit an army of volunteers to back fill the missing station data. (Not all of the data, just regions that are currently underrepresented.) That would be a potentially huge result.

  52. Carrick said

    Kenneth:

    I am at a bit of a loss to determine what point you are making here. The question remains how some data are entered and others not, or delayed in entering. I get from all this that, since an involved person like Gavin Schmidt does not have a definitive answer about the current data situation, he and other data set owners are unaware of the details of these issues with their data sets.

    Isn’t it true that Ruedy does most of the GISTemp work? I think Gavin mostly blogs from work doesn’t he?

    (Just kidding! Actually it’s my impression he’s a modeler not a phenomenologist.)

  53. Genghis said

    Carrick

    “It depends a bit at what you are looking at. If you are looking at global mean temperature (GMT), since land is only 28% or so of the total, a 0.1°C error in the land surface record (e.g., GHCN) would amount to only a 0.03°C error in GMT.”

    Nothing so grandiose. I would just compare the projected anomaly at a particular point with the actual station data at that point. I would think that the anomalies should track perfectly (within instrument error anyway) if the projection theory is correct.

    And since we are routinely talking about tenths of degrees I would expect the rejection level to be lower than tenths. Am I missing anything that I need to be aware of before I try the comparison?

  54. Carrick said

    Genghis:

    And since we are routinely talking about tenths of degrees I would expect the rejection level to be lower than tenths. Am I missing anything that I need to be aware of before I try the comparison?

    I’d say go through some of the stations that Steve McIntyre reviewed. You’ll get a bit of a flavor of what to expect and what to look for that way. Secondly, pick sites that are in surfacestations and have decent metadata. Don’t start by trying to decode measurement anomalies (e.g., sudden shifts in temperature) if you don’t know what you’re looking at. Pick sites that have other regional temperature records, this will allow you to correlate them against each other, not just pre and post adjustment temperatures.

    Jeff ID and others who’ve also looked at individual records may have some ideas too.

  55. Carrick said

    The rejection rate for a given station doesn’t have to be better than 0.1°C. That’s because individual stations have a lot of local weather fluctuations in them that get averaged out in the global mean temperature.

    The main thing is you need to make sure, when you are adjusting, that you aren’t introducing net bias. (By, for example, only making otherwise reasonable adjustments that push the data in the desired direction; usually when this happens, it is due to confirmation bias.)

  56. Kenneth Fritsch said

    Isn’t it true that Ruedy does most of the GISTemp work? I think Gavin mostly blogs from work doesn’t he?

    (Just kidding! Actually it’s my impression he’s a modeler not a phenomenologist.)

    Actually in these matters, and as chief RC moderator, I would suspect that Gavin Schmidt is considered the spokesperson for the data set owners. If he does not have an answer at his fingertips, he at least has access to those who do. He either does not think that it is necessary to provide definite answers or none are immediately available.

  57. Kenneth Fritsch said

    One is to go through GISTemp (or CCC) and run numerical experiments testing the effect of the different adjustments, also to independently vet those adjustments. I don’t think blogs are a great place to try and generate full replications, though I wish Jeff the best in doing that if he has time —having somebody with training in modern signal processing techniques go through and redo what amounts to a witches brew of ad hoc algorithms would be very enlightening.

    I want to initially look at the effects of trend differences for stations located within a 5 x 5 degree grid. On my first look, I suspect that these differences are larger than most casual observers realize, or than those who wave it off with the statement “there are good correlations for temperature anomalies for stations locally and regionally” are admitting. We could construct a Monte Carlo experiment, if we have sufficient evidence of what those station-to-station variations are and how much that variation changes for regions of the globe. We would do the experiment by varying the station coverage and station variability. I am interested in not just estimating the uncertainty in global temperature trends but local and regional ones also.

  58. Laura said

    “The idea that we’re fraudulently cutting out stations is appallingly defamatory and ignorant,” he said. “These people are desperate to come up with some hint of impropriety. The allegations are absolutely without foundation and based on profound ignorance.” quote from Gavin Schmidt

    As Schmidt is a government employee funded by us, the taxpayers, we deserve an answer with an explanation to the question. Can we get an accurate picture of the temperature by eliminating 75% of the raw data? Why did they make this seemingly arbitrary decision? Now that much of the public has reason to be skeptical of the “experts”, this hostile reaction to a reasonable question just further erodes any credibility. Over the past few weeks, we have discovered the grossly over-exaggerated prediction for the Himalayan glaciers and that Trenberth falsely announced to the media that AGW is responsible for extreme weather events. The ICO investigation has determined that the FOI laws were intentionally breached by the UEA experts.
    We pay these scientists to answer our concerns about the environment. Schmidt’s arrogant, hostile refusal to explain this illogical decision should be considered insubordination. We’re paying his salary. Try this with your employer and you can kiss your job goodbye.

  59. Sleeper said

    Re: Laura (Jan 29 15:27),

    Gavin was in RC mode when he made that statement. When he is in ‘government’ mode, he’s very quiet.

  60. GavinIsFunny said

    “These people are desperate to come up with some hint of impropriety.”

    That’s just not true, Gavin. We simply sit back and let your team hide the decline, conspire to defeat FOI requests, cite 2035 glacier melts in IPCC reports, etc. There are so many improprieties that the real challenge is no longer finding them, but keeping them straight. I’ve got news for you, Gavin, my friend: you have become a predictable and comic figure; many of us are simply laughing at you. You spin and spin and spin, but the very transparency you have been resisting is ultimately going to get forced upon you.

  61. Genghis said

    Carrick

    “The rejection rate for a given station doesn’t have to be better than 0.1°C. That’s because individual stations have a lot of local weather fluctuations in them that get averaged out in the global mean temperature.”

    Hmm, like I said, this isn’t a test of the global mean temps, this is a test of how accurate the projection is to a specific location. It should accurately show the anomaly at that location. If the projection can’t do that then how can it possibly produce a global mean temperature?

    “The main thing is you need to make sure, when you are adjusting, that you aren’t introducing net bias. (By, for example, only making otherwise reasonable adjustments that push the data in the desired direction; usually when this happens, it is due to confirmation bias.)”

    I would hope I don’t have to make any adjustments at all. Wouldn’t a station used during the calibration period but later dropped provide a clean comparison?

  62. Carrick said

    Genghis:

    Hmm, like I said, this isn’t a test of the global mean temps, this is a test of how accurate the projection is to a specific location. It should accurately show the anomaly at that location. If the projection can’t do that then how can it possibly produce a global mean temperature?

    The local fluctuations do affect your ability to measure local mean temperature. Believe it or not, it’s typically the largest source of uncertainty.

    I would hope I don’t have to make any adjustments at all. Wouldn’t a station used during the calibration period but later dropped provide a clean comparison?

    You might be able to. If it were me, I’d still bracket it with other nearby stations. That’s the most robust way of spotting measurement anomalies (calling measurement residuals “anomalies” is just one of many abuses of language by the climate people).

  63. Genghis said

    Carrick

    “The local fluctuations do affect your ability to measure local mean temperature. Believe it or not, it’s typically the largest source of uncertainty.”

    Hmmm, I have to think about that. I suspect you are correct (and why I find it hard to accept that projections are valid). If that is the case it makes it very difficult to come up with a satisfactory way to falsify the theory.

    Thank you for your helpful replies.

  64. Peter Pond said

    GavinIsFunny (No. 60)

    You said that Gavin just spins and spins and spins. Perhaps he should be referred to as “Gavin ‘Whirling Dervish’ Schmidt”?

  65. Carrick said

    Genghis:

    Hmmm, I have to think about that. I suspect you are correct (and why I find it hard to accept that projections are valid). If that is the case it makes it very difficult to come up with a satisfactory way to falsify the theory.

    I don’t know your background so I don’t know if this will make sense or not…worth a try:

    The reason that it’s hard to average out fluctuations over a single thermometer is they are highly correlated in time. In the spatial domain, it’s true that over short ranges they are positively correlated, but as you move farther, they go through zero correlation and then to a larger negative correlation, back to zero, positive (but weaker correlation) and so forth. Anyway, as you average over a sufficiently large spatial region, the fluctuations of single thermometers tend to average out… (The alternating signs of the correlation is due to the wave-like nature of weather… “Rossby waves” for example).

    Think of waves and averaging the mean surface height… if you just pick a spot, it’s very noisy, but as you increase the area that you’re averaging over, the fluctuations even out… eventually becoming smooth as glass, so the only thing that is left is really long-term up-and-down motions.

    Same idea here.

  66. Chuckles said

    Genghis and Carrick:

    If the source data for this is GHCN, and the source data for GHCN is an NCDC MMTS min/max system, then the incoming data for each day is 2 Fahrenheit temperatures – a min and a max, captured to the nearest degree Fahrenheit.

    The MMTS measuring system has a stated accuracy of +- 0.5 degrees Celsius, and is calibrated annually.

    I cannot see how any amount of averaging, homogenising, protecting or whatever you want to call it is going to produce 0.1 degrees Celsius of anything other than manufactured precision.

  67. Genghis said

    Chuckles, Carrick

    Of course it is a manufactured precision. Just like the average family has 2.3 children. But, we can get value from that precision, if the number goes up we know the population will increase, etc.

    The question to me, is after you subtract the basic cycles, night vs day, and the seasons (all waves), do weather variations or climatic variations dominate? Carrick seems to be suggesting that localized weather dominates overpowering climate signals and I tend to agree. But I seem to get the impression that Climate Scientists think that Climate dominates and that they can get a clear signal and make global and local climate predictions.

    And Carrick I have a reasonable scientific background, a triple E degree, decent in math (unused and rusty though), but mostly I was a computer hack with a couple of games to my credit. Now, I build airplanes and fly them.

    My basic question remains, how do we tease a climate signal out of the weather signal in a falsifiable way? And yes Carrick I see everything as waves.

  68. Carrick said

    Chuckles:

    I cannot see how any amount of averaging, homogenising, protecting or whatever you want to call it is going to produce 0.1 degrees Celsius of anything other than manufactured precision.

    If you’re really interested in answering this question for yourself, make a ruler on the computer then print it off on a piece of paper. Have the distance between each line be 1 cm.

    Then pick a piece of paper, and randomly cut off a length of it.

    Measure this paper 100 times using the ruler. Don’t line up the ruler’s tick marks with the start or end of the length of paper, just count how many ticks you get. If the object were 6.7 cm long, sometimes you’ll count 6, sometimes 7.

    Now average the tick counts, compute their standard deviation, then the error of the mean (standard deviation divided by sqrt(100)).

    You’ll end up with an answer that has a precision of 0.1 cm. Now compare it with a metric ruler (it’s best if you generate this with your computer and printer too, so any distortion by the printer is reflected in both scales).

    If you do this carefully, you’ll find that not only does your reading have a precision of 0.1 cm, it is accurate to about 0.1 cm as well.

    If you have individual readings on a thermometer with a precision of 0.5°C, over a long enough period of measurement the precision of the measurement will eventually be very small compared to other things that control the precision, like the local fluctuations of temperature about its mean. The only way to make that more precise is to increase the area over which the mean is taken.

  69. Carrick said

    Measure this paper 100 times using the ruler. Don’t line up the ruler’s tick marks with the start or end of the length of paper, just count how many ticks you get. If the object were 6.7 cm long, sometimes you’ll count 6, sometimes 7.

    I meant to say, on this step, you need to randomly place the ruler relative to the paper.

    If you get the methodology right, it works. I’ve done it myself.
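
    For anyone who wants to try it without a printer, here is a sketch of the same experiment in code, with the random ruler placement described above and an assumed 6.7 cm strip:

      # Sketch of the dithered-ruler experiment: a 1 cm ruler, randomly placed
      # each time, measuring a strip of assumed length 6.7 cm. Each count is only
      # good to the nearest whole centimetre, yet the average recovers the length
      # to about 0.1 cm.
      import numpy as np

      rng = np.random.default_rng(42)
      true_length = 6.7                                 # cm, the "unknown" strip
      n_trials = 100

      offsets = rng.uniform(0.0, 1.0, n_trials)         # random placement of the ruler
      counts = np.floor(true_length - offsets) + 1      # whole ticks falling on the strip

      mean = counts.mean()
      sem = counts.std(ddof=1) / np.sqrt(n_trials)      # error of the mean
      print("estimate: %.2f +/- %.2f cm (true value %.1f)" % (mean, sem, true_length))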

  70. Carrick said

    Genghis:

    My basic question remains, how do we tease a climate signal out of the weather signal in a falsifiable way?

    What I described is a process for reducing weather to a climate signal. It’s a numerical technique not something that is directly “falsifiable” in the sense of a theory.

    But you can test the validity of the method by creating synthetic data with (as well as you can) the same temporal and spatial fluctuations and with a known temperature trend over time. Then you see how well your method recovers the original temperature trend hidden within the data. This is called a Monte Carlo simulation. You can actually get an objective estimate of how good the technique is by repeating this a number of times to measure the mean and standard deviation of your extracted trend to the actual trend.

    This method not only allows you to measure the validity of your method, but often suggests improvements you can make to get it to more reliably extract the desired trend from the real data. The more closely your simulation matches reality, the more accurate the results of your Monte Carlo will be.
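
    A toy version of that kind of Monte Carlo, with everything invented: 50 synthetic stations, a known 0.02 C/yr trend, and AR(1) noise standing in for persistent weather.

      # Sketch only: synthetic data with a known trend plus red-ish station noise;
      # check how well a plain fit to the regional mean recovers the trend.
      import numpy as np

      rng = np.random.default_rng(1)
      years = np.arange(1950, 2011)
      true_trend = 0.02                    # C/yr, the "hidden" trend
      n_stations, n_runs = 50, 500

      recovered = []
      for _ in range(n_runs):
          noise = np.zeros((n_stations, years.size))
          shocks = rng.normal(0, 0.8, noise.shape)
          for t in range(1, years.size):                 # AR(1) "weather" per station
              noise[:, t] = 0.5 * noise[:, t - 1] + shocks[:, t]
          data = true_trend * (years - years[0]) + noise
          recovered.append(np.polyfit(years, data.mean(axis=0), 1)[0])

      recovered = np.array(recovered)
      print("recovered trend: %.4f +/- %.4f C/yr (true %.2f)"
            % (recovered.mean(), recovered.std(ddof=1), true_trend))

    The spread of the recovered slopes is the objective measure of how good the method is; making the synthetic weather more realistic makes that measure more honest.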

  71. Arthur Dent said

    Surely Carrick this is only true if you are making sequential measurements of the same item, in your example a piece of paper. In the temperature world this would be simulated by measuring the same thermometer at the same site at say 0.1 sec intervals for say 1 minute. I agree that under these circumstances you can improve the precision.

    However we are talking about averaging several thousand different and unrelated things (the actual temperature at the GHCN measuring stations). Although there are several thousand of these things, the actual number of independent measurements is only one at each position. How can this then improve on the precision of the measurement at each station?

  72. Chuckles said

    Carrick, Thanks for the response; I have the same problem as Arthur D.

    I can well believe that the MMTS system would do multiple readings and average them, or similar; I’ve used the technique myself on many occasions.

    But we have (e.g.) a single reading of a single measurement on a single site.
    Tomorrow we have another measurement of that site.
    Both readings were made at +-0.5C, and then truncated to the nearest whole degree Fahrenheit, and no amount of averaging is going to change that or improve it as far as I can see?

  73. RB said

    I’m hardly a statistics expert, but I’m guessing that the underlying assumption is that for each station, the temperature series corresponds to a stationary and ergodic process. Then, for each station, the error goes down as 1/sqrt(N) as the number of measurements increases.

  74. Genghis said

    Carrick

    “But you can test the validity of the method by creating synthetic data with (as well as you can) the same temporal and spatial fluctuations and with a known temperature trend over time. Then you see how well your method recovers the original temperature trend hidden within the data.”

    You are assuming that there is a temperature trend hidden within the data (confirmation bias anyone?) and you have already stated that local temperature (weather) wipes that anomaly out.

    If Climate anomalies cannot be seen at individual stations because of weather, what makes you think that averaging random anomalies from those stations will produce a meaningful result?

    Which leads us back full circle to my original question. If there is a climate signal that is discernible through temperature records then the Climatologists should be able to accurately project that anomaly (like tides as climate as opposed to weather waves) to all other stations and to a global climate mean.

    None of the records I have seen indicate that it can be done.

  75. Carrick said

    Genghis:

    You are assuming that there is a temperature trend hidden within the data (confirmation bias anyone?) and you have already stated that local temperature (weather) wipes that anomaly out.

    0°C/century is a trend too. So is -1°C/century.

    The trend, whatever it is, is linearly superimposed with local weather fluctuations. Averaging over large geographical regions removes much of the effect of these local fluctuations.

    If Climate anomalies cannot be seen at individual stations because of weather, what makes you think that averaging random anomalies from those stations will produce a meaningful result?

    If you simply mean this rhetorically in the sense “I’ve made my mind up, don’t confuse me with the facts”, just say that and be done.

    I’ve also already addressed how one tests this hypothesis (spatial averaging removes local fluctuations) and the basis for why one can believe it to be true, and further how one can test it.

  76. Carrick said

    Chuckles:

    Both readings were made at +-0.5C, and then truncated to the nearest whole degree Fahrenheit, and no amount of averaging is going to change that or improve it as far as I can see?

    Not on the individual measurements, but the error in the ensemble still gets reduced by 1/sqrt(N) as long as the fluctuations in the data are large compared to the truncation error.

    Simple example. Let’s take the fluctuation level to be ±2°C, the mean value (a priori) to be 20°C, now look at the effect of truncation on 365 observations:

    truncation mean precision
    0.0 21.12 0.10
    0.1 21.12 0.10
    0.2 21.12 0.10
    0.5 21.14 0.10
    1.0 21.10 0.10
    2.0 21.11 0.11
    5.0 20.97 0.12
    10.0 20.25 0.08

    Note: If the truncation error is large compared to the internal fluctuation of the data, averaging the data will not improve the accuracy.

    This effect is well known, so when data get digitized (which is exactly the same as rounding it to a specific accuracy), typically a couple of bits of noise get added to the data before it is digitized. Not doing this introduces digital noise that affects the fidelity of the data/music.

  77. Carrick said

    Sorry typo: The mean value was 21°C, not 20°C.
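
    A sketch of that experiment as I read it: 365 readings around a 21 C mean with roughly 2 C fluctuations, rounded to coarser and coarser steps before averaging.

      # Sketch only: quantize (round) simulated daily temperatures to various
      # step sizes and see what it does to the mean and its precision.
      import numpy as np

      rng = np.random.default_rng(7)
      true_mean, sigma, n = 21.0, 2.0, 365
      raw = rng.normal(true_mean, sigma, n)

      print("step  mean   precision")
      for step in (0.0, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0):
          obs = raw if step == 0.0 else np.round(raw / step) * step
          print("%4.1f  %5.2f  %.2f" % (step, obs.mean(), obs.std(ddof=1) / np.sqrt(n)))

    As in the table above, the mean and its roughly 0.1 C precision survive rounding steps much coarser than 0.1 C, and only degrade once the step becomes comparable to the spread of the data.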

  78. Mark T said

    RB said
    January 30, 2010 at 2:27 pm

    I’m hardly a statistics expert, but I’m guessing that the underlying assumption is that for each station, the temperature series corresponds to a stationary and ergodic process.

    By definition, if there is a non-zero trend, then it cannot be stationary. This is clearly not an assumption for stations. Furthermore, if it is not stationary, it is not ergodic. The noise in the measurements is probably stationary (uniformly distributed between +/- 0.5 C, for example).

    Then, for each station, the error goes down as 1/sqrt(N) as the number of measurements increases.

    As Carrick noted, no. The noise (or error) goes down as 1/sqrt(N) different stations are averaged, not the noise for each station.

    This is a result of the central limit theorem and only holds if each mean exists, is finite, and the distributions of each of the stations are identical (which is what Carrick alluded to). In general, these conditions only need to be weakly true for close to the 1/sqrt(N) noise reduction, but the CLT gets overused rather frequently (as does the law of large numbers, which is related). It is not a silver bullet that wipes out all noise/error in measurements.

    Mark
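
    A toy check of that distinction, with synthetic AR(1) "weather" that is assumed independent from station to station:

      # Sketch: the spread of one station's annual mean vs. the spread of an
      # average over 25 stations with independent noise; the latter shrinks by
      # roughly 1/sqrt(25). All numbers invented.
      import numpy as np

      rng = np.random.default_rng(5)

      def station_year(phi=0.9, sigma=1.0, days=365):
          x = np.zeros(days)
          for t in range(1, days):                 # persistent (AR(1)) daily weather
              x[t] = phi * x[t - 1] + rng.normal(0, sigma)
          return x.mean()

      one_station = np.array([station_year() for _ in range(1000)])
      n = 25
      network = np.array([np.mean([station_year() for _ in range(n)]) for _ in range(200)])

      print("sd of one station's annual mean: %.3f" % one_station.std(ddof=1))
      print("sd of the %d-station average:    %.3f" % (n, network.std(ddof=1)))
      print("single-station sd / sqrt(%d):    %.3f" % (n, one_station.std(ddof=1) / np.sqrt(n)))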

  79. Mark T said

    “The noise (or error) goes down as 1/sqrt(N) different stations are averaged, not the noise for each station”

    should have read as

    “The noise (or error) goes down by 1/sqrt(N) as N different stations are averaged, not the noise for each station”

    Mark

  80. Mark T said

    Probably worth noting that if measurement errors have a non-zero mean, then averaging will converge to that value by the CLT (assuming the other conditions are met). This is often a result of a systemic error, such as a 0.5 C bias in the instrument, or always rounding up to the nearest integer reading.

    Mark

  81. Genghis said

    Carrick

    “I’ve also already addressed how one tests this hypothesis (spatial averaging removes local fluctuations) and the basis for why one can believe it to be true, and further how one can test it.”

    But the climate scientists are removing data necessary for spatial averaging, by their actions they have proclaimed that they have removed local fluctuations.

    It also seems to me that there may be some kind of relationship between the number of stations and the length of record needed to garner a climate signal.

  82. Carrick said

    Genghis:

    But the climate scientists are removing data necessary for spatial averaging, by their actions they have proclaimed that they have removed local fluctuations.

    Now you’re moving the goal post.

    But nobody is “removing” data; the data exist, they simply haven’t been incorporated into GHCN. And the averaging method used by GISTemp is sufficient to handle most of the effects of changing numbers of stations over time.

    One of the problems I haven’t brought up is that fluctuations associated with nearby stations are very highly correlated. Having too many stations doesn’t help you very much with the primary source of uncertainty; you need a larger geographical area to handle that.

  83. Mark T said

    Correlated signals of any kind, even if it is true noise, won’t average out, for sure.

    Mark

  84. Kenneth Fritsch said

    One of the problems I haven’t brought up is that fluctuations associated with nearby stations are very highly correlated. Having too many stations doesn’t help you very much with the primary source of uncertainty, you need a larger geographical area to handle that.

    Carrick, that statement needs a link to a published study of the very highly correlated fluctuations of nearby stations.

    In the initial stage of my studies on station-to-station variability I have found associations of statistically significant and near-significant change points in individual stations in a 5 x 5 degree grid (mid latitudes). That situation does not, however, mean that the longer term temperature anomaly trends are similar; in fact they were found to have high variability.

  85. Carrick said

    Kenneth, it is the spatial correlation of the shorter duration fluctuations that I am more familiar with.

    There is a well-known latitude effect on trend, so you wouldn’t necessarily expect very-long-duration fluctuations to be the same as you move north-to-south (even within a 5° grid point). Change in elevation may act more like a poleward shift in latitude, so there is that too. The papers studying correlation are pretty old at this point; if you want, I’ll send the GISTemp team an email and ask if they can point us towards relevant published literature.

    I’d love to see something more quantitative from you than this on your results.

  86. Carrick said

    January 29, 2010 at 11:25 am
    #37 Genghis:
    Has there been any testing to see if the tapered weight function is accurate with real data? It should be trivial to withhold a station’s data, process a grid point with the tapered weight function (on the station) and compare it to the actual data.
    Is there any place I can see the results of that test?

    You can download the Clear Climate Code reimplementation of GISTemp, or GISTemp itself (http://data.giss.nasa.gov/gistemp). Maybe when people get tired of just bitching about the warts that are in the data, they’ll get serious about examining the techniques used in correcting them.

    I’m not saying by any means that I think the corrections are perfect, but it would be nice to see full global mean temperature reconstructions performed showing the before/after effects of relaxing or adjusting different corrections. That’s the only way you’ll ever see how much any of these actually matter.

    ***********

    Carrick.

    1. The Clear Climate Code didn’t work for me last time I downloaded it. They have a new version I was going to test.

    2. I suggested just such a study to EM Smith, but the code is brittle to station removal. It breaks and won’t run.

    Anyway, the right way to proceed, it would seem to me, is to:

    A. Test whether the removal of stations has any impact by removing them in prior years.

    Also, Nick has talked a bit about how the anomaly method cures all. Upon review of the code, I don’t think we can say that any more.

  87. Carrick

    But nobody is “removing” data, the data exist they simply haven’t been incorporated into GHCN. And the averaging method used by GISTemp is sufficient to handle most of the effects of changing numbers of stations over time.

    I think it’s an open question as to whether the averaging method of GISStemp can handle the removals.

    1. See McIntyre’s post on the reference method:

    http://climateaudit.org/2008/06/28/hansens-reference-method-in-a-statistical-context/

    2. From my understanding of the code, station temps are adjusted, averaged, and combined, and then an anomaly is taken, as opposed to taking anomalies first for each station and then averaging (on a local basis). Open to debate I suppose, but perhaps EM Smith can weigh in, since he has worked with the code more recently than I have and he actually has it running.
    3. I asked EM Smith to try to do a basic check on the impact of removal, and the code, according to him, would fail. He is looking at it.

    The point is, since we have the code, it should be easy in principle to see the effect of station removals rather than speculate about whether an algorithm will handle things correctly or not.

  88. Carrick said

    Steven Mosher:

    The point is, since we have the code, it should be easy in principle to see the effect of station removals rather than speculate about whether an algorithm will handle things correctly or not.

    I agree with this. I also understand there have been a lot of updates recently to GHCN (see CA post), so some testing may be possible very soon.

    There are reasons I would expect it to work, at least if implemented correctly, I’ve outlined them above. It’s always fun to see whether expectation meets reality!

  89. Re: steven mosher,
    “From my understanding of the code, station temps are adjusted, averaged, and combined, and then an anomaly is taken, as opposed to taking anomalies first for each station and then averaging (on a local basis).”
    “averaged and combined” – I think this is local. A weighted average of local temperatures is made for each grid point, taking stations within a radius of RCRIT (256km or 1200km). The weight is a linear taper with distance, down to zero at RCRIT. The average 1951-1980 for that series on the gridpoint is then subtracted to create the gridpoint anomaly.

    If the local stations had no missing data in 1951-80, this would have the same effect as calculating station anomalies and then grid-averaging. If they do, then the effect is the same as if you had calculated station anomalies, replacing missing 1951-80 data by the local grid-point average calculated with that tapered RCRIT weight – i.e. a reasonable local approximator. So there is not much difference between this method and the climate anomaly method, where individual station 1951-80 means are calculated with an interpolation formula for missing years from nearby stations. It just implicitly makes use of a reasonable gridpoint-centred interpolation formula.
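    A minimal sketch of the gridding step described above, written from that description rather than from the GISTemp source; the station record format and the distances are illustrative assumptions:

        # Tapered-weight grid-point series and anomaly, per the description above.
        # Each station is (distance_km_from_gridpoint, {year: temperature}); RCRIT is 1200 km here.

        RCRIT = 1200.0

        def gridpoint_anomaly(stations, base=(1951, 1980)):
            years = sorted({y for _, record in stations for y in record})
            series = {}
            for y in years:
                num = den = 0.0
                for dist, record in stations:
                    if y in record and dist < RCRIT:
                        w = 1.0 - dist / RCRIT   # linear taper, zero weight at RCRIT
                        num += w * record[y]
                        den += w
                if den > 0:
                    series[y] = num / den
            # Subtract the 1951-1980 mean of the grid-point series to get the anomaly series.
            base_vals = [t for y, t in series.items() if base[0] <= y <= base[1]]
            base_mean = sum(base_vals) / len(base_vals)
            return {y: t - base_mean for y, t in series.items()}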

  90. boballab said

    How GISS does the analysis is in the Hansen et al 1999 paper.

    4. Combination of Station Records
    4.1. Records at Same Location
    We first describe how multiple records for the same location are combined to form a single time series. This procedure is analogous to that used by HL87 to combine multiple-station records, but because the records are all for the same location, no distance weighting factor is needed.

    Two records are combined as shown in Figure 2, if they have a period of overlap. The mean difference or bias between the two records during their period of overlap (dT) is used to adjust one record before the two are averaged, leading to identification of this way for combining records as the “bias” method (HL87) or, alternatively, as the “reference station” method [Peterson et al., 1998b]. The adjustment is useful even with records for nominally the same location, as indicated by the latitude and longitude, because they may differ in the height or surroundings of the thermometer, in their method of calculating daily mean temperature, or in other ways that influence monthly mean temperature. Although the two records to be combined are shown as being distinct in Figure 2, in the majority of cases the overlapping portions of the two records are identical, representing the same measurements that have made their way into more than one data set.

    A third record for the same location, if it exists, is then combined with the mean of the first two records in the same way, with all records present for a given year contributing equally to the mean temperature for that year (HL87). This process is continued until all stations with overlap at a given location are employed. If there are additional stations without overlap, these are also combined, without adjustment, provided that the gap between records is no more than 10 years and the mean temperatures for the nearest five year periods of the two records differ by less than one standard deviation. Stations with larger gaps are treated as separate records.

    From there Dr. Hansen discusses the advantages and disadvantages of the system.

    4.2. Regional and Global Temperature
    After the records for the same location are combined into a single time series, the resulting data set is used to estimate regional temperature change on a grid with 2°x2° resolution. Stations located within 1200 km of the grid point are employed with a weight that decreases linearly to zero at the distance 1200 km (HL87). We employ all stations for which the length of the combined records is at least 20 years; there is no requirement that an individual contributing station have any data within our 1951-1980 reference period. As a final step, after all station records within 1200 km of a given grid point have been averaged, we subtract the 1951-1980 mean temperature for the grid point to obtain the estimated temperature anomaly time series of that grid point. Although an anomaly is defined only for grid points with a defined 1951-1980 mean, because of the smoothing over 1200 km, most places with data have a defined 1951-1980 mean.

    http://pubs.giss.nasa.gov/docs/1999/1999_Hansen_etal.pdf

    Notice that all averaging is done before anomalies are made. The first quote corresponds to Step 1 of the GIStemp current analysis page:

    Step 1 : Simplifications, elimination of dubious records, 2 adjustments (do_comb_step1.sh)
    ———————————————————-
    The various sources at a single location are combined into one record, if possible, using a version of the reference station method. The adjustments are determined in this case using series of estimated annual means.

    The second quote corresponds to Step 3 of the GIStemp current analysis page:

    Step 3 : Gridding and computation of zonal means (do_comb_step3.sh)
    ————————————————
    A grid of 8000 grid boxes of equal area is used. Time series are changed to series of anomalies. For each grid box, the stations within that grid box and also any station within 1200km of the center of that box are combined using the reference station method.

    http://data.giss.nasa.gov/gistemp/sources/gistemp.html

    So what about Step 2? Well, that is where they homogenize the data before making anomalies.
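    A minimal sketch of the “bias method” for combining two overlapping records at the same location, as described in the Hansen et al. quote above; the record format (a dict of year to annual mean) is an illustrative assumption:

        # Offset the second record by the mean difference over the overlap, then average.

        def combine_bias_method(rec_a, rec_b):
            """rec_a, rec_b: dicts mapping year -> mean temperature for the same location."""
            overlap = sorted(set(rec_a) & set(rec_b))
            if not overlap:
                raise ValueError("no overlap; the records must be handled differently")
            dT = sum(rec_a[y] - rec_b[y] for y in overlap) / len(overlap)
            combined = {}
            for y in sorted(set(rec_a) | set(rec_b)):
                vals = []
                if y in rec_a:
                    vals.append(rec_a[y])
                if y in rec_b:
                    vals.append(rec_b[y] + dT)   # adjust rec_b by the overlap bias dT
                combined[y] = sum(vals) / len(vals)
            return combined

    A third record would then be combined with this result in the same way, and so on, per the quoted description.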

  91. Re: boballab,
    “Notice that all averaging is done before anomalies are made.”
    Bob, I’m not sure what point you are making here. But I think the averaging referred to here is very limited. It’s just the process of combining different record fragments for the same station.

    They calculate an annual anomaly average in Step 2, but it seems to have limited use. The main initial averaging between stations is done in step 3, toSBBXgrid.f, where they do this process of deriving a grid point value from stations within a radius of RCRIT, and at the same time compute a 1951-80 average, and subtract to get a gridpoint anomaly.

  92. Phil said

    Wouldn’t whether the anomaly calculation method described by Hansen is a “reasonable local approximator” depend on how many degrees of freedom one had in performing the grid point averaging and the grid point base period mean temperature calculations? Apparently there remain only about 1,500 stations for 8,000 grid cells, or about 19%. Thus, wouldn’t there be zero degrees of freedom for the vast majority of the grid cells spatially (there may be varying degrees of freedom temporally, depending on how long a given station’s records may be)? And even if you had 8,000 stations distributed so that there was exactly one per grid cell, wouldn’t you still have zero degrees of freedom spatially? Not sure, just trying to figure it out.

  93. RE 91.

    Nick, I’m not so sure it’s fragments from the same station. Have a look at the code. You can’t compile the written paper.
    In any case the definitive test of your assertions is running the code, not looking at a document that purports to describe the code.

  94. E.M.Smith said

    I’ve only skimmed about half the comments so far, but a couple of points. First, if you are going to stick to the “One and Done” part of the story that GHCN was just made in 1990 and it’s an artifact, then you must explain the “medium sized dying of thermometers” in 2006. There is a nice graph of it here:

    http://noconsensus.wordpress.com/2010/01/19/long-record-ghcn-analysis/

    in an article that does a very nice job of also confirming my observation that the long lived thermometers are not the carriers of the “warming signal” but it is in the short lived thermometers (that are the ones showing up in those warmer places…).

    As an example of the 2006 dying, we have Madagascar:

    http://chiefio.wordpress.com/2010/01/31/mysterious-madagascar-muse/

    where I also give you two GISS generated ‘anomaly maps’. One showing Madagascar with no data (settings to suppress ‘smoothing’ spreading of data) and the other showing the (fictional?) warm blob you get by default.

    That same link also somewhat ‘puts the lie’ to the whole story about “Slow to report” in that there is a link to Wunderground for Madagascar with “now” data in it…. So if it is a “slow to report AND PROCESS” issue, my response would be a simple: OK, dump NCDC and let a contract to Wunderground. They can get the data just fine.

    Finally, on the “The Anomaly Will Save Us!” story:

    The anomaly is not computed as ‘self to self’; it is computed as “self to basket after the basket has had adjustments, homogenizing, UHI “corrections” (that often go the wrong way – in the case of Pisa, Italy, by 1.4 C the wrong way), etc.”. The anomaly cannot protect you from the things done to the data prior to it being calculated at the very end. Further, I think there is a bug in the anomaly code (or at least in the anomaly mapper) that does not handle missing data well and handles null anomalies worse:

    http://chiefio.wordpress.com/2010/01/31/is-the-null-default-infinite-hot/

    So you can wave the “Hypothetical Cow” anomalies around all you want; I’m going to keep asking “Where is the beef IN THIS CODE?”.

    Oh, and on the slight upward tilt to the naive full average issue. Yes, the full average is just about dead flat. This was the first thing I did, and I found it uninteresting. Spent about 6 months after that working on GIStemp and NOT on GHCN. Then ran into a different “What?” moment and took a more detailed look at the data patterns. That is when I found the “by altitude” and “by latitude” and “by airport” biases in the data. The following is an excerpt from an email to someone else who had raised this issue toward the end of 2009. So give Nick about 9 months to catch up.

    Begin Quote:

    One of the first things I did was a gross average, like that Fig. 6, and found the very mild up-tilted trend line. The fudging of GHCN is more subtle than that. IMHO, it interacts with GIStemp (and most likely CRU as well) …

    So you have thermometers leave the Andes and you get a fictionally hot Bolivia (where no data is newer than 1990, yet the present GIStemp anomaly map shows it a nice red…). You have thermometers move from the cool ocean currents of Morocco shores and into the Atlas Mountains on the edge of the Sahara (and Morocco gets a nice red rosy patch on the anomaly map too…)

    So the question is: How is this done, but with lesser impacts on the gross averages? And I don’t really know. I can only guess. (It’s on the “someday” list…). There has to be an offsetting cold number somewhere, but in a place that does not have an impact on the anomaly maps. So I’d guess that in some place with a cold record, a duplicate is right next to it. Perhaps a short lived (under 20 years, so GIStemp tosses it out) or perhaps just a duplicate with mod flag (so it gets ‘merged’). There are several ways I can think of to do this…

    Or perhaps just a small part of some geography that can be handwaved away in some manner. So, for example, Canada has a cooling gross average despite losing the Rocky Mountain thermometers. We color all of Yukon, Northwest Territories, Nunavut, and nearby areas a nice red since only one thermometer survives there and it is in the “Garden Spot of the Arctic”, but raise the percentage of stations clustered together in, as a hypothetical, Ontario. As a gross average, you still get near neutral, but on the anomaly map, most of Canada looks red.

    Basically, I think the “game” is to make large areas of cold be covered by a ‘nearby’ warmer thermometer, but have a couple of ‘cool’ ones covering a single place such that they don’t impact the anomaly map disproportionately, but do offset the ‘fudge’ in the large areas.

    END QUOTE.

    Now this could all be perfectly innocent and nothing but an artifact of a misplaced belief. I’ve run into a strong thread in the “warmer” community of the belief in the perfection of the anomaly. It will protect from all thermometer changes. (And it well might if done as ‘selfing’, where all thermometers were, up front, turned into an anomaly only against themselves.)

    Given that belief, I could easily see someone saying “Bolivia is a PITA because they always send data a week late” or “Mountains have more data loss in winter due to snow outages, let’s drop them to improve data stability” since, after all, “it is all just anomalies so it doesn’t matter”.

    The problem is that GIStemp must then be ‘a perfect filter’ and it is not. You don’t need a major failure, only a minor lack of perfection. When 90% or so of your thermometers show location and survivor bias, even a 10% leakage of that bias will give you a significant error in the output. And I’ve done a benchmark that demonstrates that GIStemp is NOT a perfect filter:

    http://chiefio.wordpress.com/2009/11/12/gistemp-witness-this-fully-armed-and-operational-anomaly-station/

    So you can talk all the hypotheticals you want. I’ve run NCDC data through GISS code and with the thermometer ‘drops’ being exactly the ones that were in the GISS product from 5/2007 to 11/2009 (during the USHCN to USHCN.v2 transition data outage) AND THE ANOMALY CHANGES.

    So we have an existence proof that it isn’t a perfect filter. Now we’re just haggling over ‘how much’…

    Other “issues” with the “GHCN is fine” response are in the posting here:

    http://chiefio.wordpress.com/2010/01/27/temperatures-now-compared-to-maintained-ghcn/

  95. Steve #93,
    I am looking at the code, tho’ indeed I haven’t run it. In Step 1 it echoes what it is doing. It says, emphatically:
    echo “Combining overlapping records for the same location:”
    echo “Fixing St.Helena & Combining non-overlapping records for the same location:”

  96. E.M.Smith said

    4.2. Regional and Global Temperature
    After the records for the same location are combined into a single time series, the resulting data set is used to estimate regional temperature change on a grid with 2°x2° resolution. Stations located within 1200 km of the grid point are employed with a weight that decreases linearly to zero at the distance 1200 km (HL87).

    In other words: We make a basket of records averaged together.

    We employ all stations for which the length of the combined records is at least 20 years; there is no requirement that an individual contributing station have any data within our 1951-1980 reference period.

    In other words, the baseline grid cell average can be a different basket of records.

    As a final step, after all station records within 1200 km of a given grid point have been averaged, we subtract the 1951-1980 mean temperature for the grid point to obtain the estimated temperature anomaly time series of that grid point.

    In other words: THEN we make an anomaly by comparing these two baskets of different things.

    Although an anomaly is defined only for grid points with a defined 1951-1980 mean, because of the smoothing over 1200 km, most places with data have a defined 1951-1980 mean.

    In other words: We can take 1500 records and stretch them into 8000 grid boxes by reusing a lot of them and with many cells having only one record, from somewhere else, and not always the same one that was in the baseline for that cell.

    Which is the whole point I’ve been trying to get across to folks about the anomaly will not save you because:

    1) It is done AFTER all the temperatures are used for ‘fill in’, homogenizing, and UHI “correction’ that often goes the wrong way.

    2) It is not done via thermometer ‘selfing’ by via “apple basket” to “orange basket”.

  97. Carrick said

    EM Smith:

    UHI “correction’ that often goes the wrong way

    This bothers me when people make claims like this and don’t back them up.

    What percent of UHI “corrections” went the wrong way?

  98. Carrick said

    Come to think of it, since EM Smith is insinuating that the GISTemp UHI correction is increasing the temperature trend, it should be easy enough to test this. Just compare the difference in global mean temperature with and without the UHI correction.

    That’s one validation test. What others? With and without homogenization?

  99. vjones said

    #97 Carrick,
    Detailed by Steve McIntyre: http://climateaudit.org/2008/03/01/positive-and-negative-urban-adjustments/

    Some of them are, quite frankly, appalling. Hall of Shame here (and I’ve since discovered that these are far from the worst):

    http://diggingintheclay.blogspot.com/2009/11/how-would-you-like-your-climate-trends.html

  100. Carrick said

    VJones, the question wasn’t whether mistakes were made by how frequent (what percentage) were mistakes, and what impact they had on the final product, the global mean temperature?

    Just a general suggestion, when people find errors, they should also send them to Reto Ruedy. I assume you and EM Smith have, right?

  101. Carrick said

    make that “were made, but how frequent…”

    Also, I’ve downloaded and run ClearClimateCode. It does all the chores of downloading the files from GISTemp for you, so you can do a complete build in a few minutes (assuming you have python installed).

  102. Carrick said

    The whole CCC build of GISS is invoked with this single run:

    python tools/run.py

    You don’t get much more turnkey than that.

    I also compared the output to GISS, and the maximum error was ±0.03°C (the residual had a trend on the order of 1e-4°/century, so no net influence of adding the extra station data). Not identical, I think, because of recent changes to the GHCN database itself, and GISTemp only gets rebuilt once a month. When run using the same input data, the CCC authors claim reproduction to within the 0.01 rounding error.
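    A minimal sketch of that kind of comparison, assuming the two global-mean anomaly series have already been loaded as parallel lists for the same years (the names below are illustrative, not CCC or GISTemp output formats):

        # Compare two global-mean anomaly series: max absolute difference and residual trend.

        def compare_series(years, anoms_a, anoms_b):
            residual = [a - b for a, b in zip(anoms_a, anoms_b)]
            max_abs_diff = max(abs(r) for r in residual)

            # Ordinary least-squares slope of the residual, reported per century.
            n = len(years)
            ybar = sum(years) / n
            rbar = sum(residual) / n
            slope = sum((y - ybar) * (r - rbar) for y, r in zip(years, residual)) / \
                    sum((y - ybar) ** 2 for y in years)
            return max_abs_diff, slope * 100.0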

  103. Andrew said

    97 – Basically he’s assuming that UHI correction should ALWAYS reduce warming, and not that it adds as often as it subtracts. This does strike me as odd, but without careful examination I can’t say that it isn’t ever justified.

  104. Carrick said

    Andrew:

    Basically he’s assuming that UHI Correction should ALWAYS reduce warming, and not that it adds as often as it subtracts.

    Yeah, I understand that. One would assume the net effect of the UHI correction is to reduce the temperature trend.

    The objective questions are: 1) does the UHI correction reduce the temperature trend? 2) is it applied incorrectly in some cases, or does the algorithm fail? 3) for what fraction of the UHI corrections does it increase warming?

    The problem I have with some of the critics of GISTemp is that they are basically pointing to “warts” in the methodology. Every complex code that handles such a range of issues as this code does is going to have warts. The proper question is “how much of a difference does a given wart make to the final answer?”

    And my other point is, if you spot a wart that really bothers you, point it out to the code maintainers and suggest a fix for it.

  105. Carrick,

    I’ve not had the same luck with CCC. I posted my crash log to them a while back and they sent me mail about downloading the latest version. From chatting with others who had similar issues (I think with a missing package for Python), I probably need to sit down and devote some attention to my environment (it’s not set up for code development). Since others have it running, the fault probably lies with my setup (Python version, I’m pretty sure), so I’m gunna hold off comment till I can give it proper attention (after Feb 10th if I am lucky).

  106. Carrick:

    Validation tests/sensitivity:

    WRT UHI.

    A. First, I think, would be a clear benchmark of what GISS says if only rural sites are used. I don’t think it makes sense to adjust for UHI and I certainly don’t buy Hansen’s approach. Let me contrast an adjustment like a thermometer change or change in altitude with a UHI change. With a change in thermometer (from LIG to MMTS, for example) or a change in altitude, one can test the two thermometers side by side and model or estimate the difference. That step change can either be accounted for with an adjustment or by employing first differences to estimate trends. The same with altitude changes or TOBS changes: discrete event, so model the change or estimate the change, and decide how to handle it. With UHI you’ve got a process that happens over time. Hansen’s hinge-point approach has nothing to recommend it except that it is easy.
    B. Next I would look at the definition of Rural
    1. Nightlights has problems
    2. Population figures need to be updated.

    C. I’d look at the station combining process and look at the sensitivity in that procedure to the “magic numbers”:
    1. the 20 year overlap requirement
    2. the 1200km/600km criteria. Those numbers are pulled from a very thin study of correlation that was only done (I recall) on NH sites, and the criteria weren’t held very high.

    The point on C would go to confidence limits I suspect.

    D. Polar extrapolations. Tilo did some nice investigative work on that. Bears scrutiny.

    The other test I would do would require a bunch of work. That would involve a different approach altogether, one that doesn’t rely on adjustments or homogenization: basically, first differences.
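    A minimal sketch of the first-differences idea, assuming each station record is a dict of year to annual mean temperature (this is just the general idea, not any particular published implementation):

        # Average year-to-year changes across whatever stations report in both years,
        # then cumulate. No baseline period and no station adjustments are needed.

        def first_difference_series(stations, years):
            """stations: list of dicts mapping year -> annual mean temperature."""
            series = [0.0]
            for prev, curr in zip(years[:-1], years[1:]):
                diffs = [s[curr] - s[prev] for s in stations if prev in s and curr in s]
                step = sum(diffs) / len(diffs) if diffs else 0.0
                series.append(series[-1] + step)
            return dict(zip(years, series))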

  107. EM

    1) It is done AFTER all the temperatures are used for ‘fill in’, homogenizing, and UHI “correction’ that often goes the wrong way.

    2) It is not done via thermometer ’selfing’ by via “apple basket” to “orange basket”.

    I’m not following your analogy on number 2. Pseudo code or math or something would help me understand.

  108. vjones said

    #100 Carrick,

    You make a good point about going back to GISS with the errors. I was only dabbling when I produced that Hall of Shame. But since getting serious about it I have been working mainly with NOAA GHCN data, which is just the input to GIStemp. I am about to turn my attention again to GISS adjusted data. I don’t think they (or anyone) would thank me for pointing out just the odd station here or there where the UHI correction is “the wrong way”; however, when I do have the full list, by station code, which will not take long, I will follow it up. I would hope I would also be in a position to quantify the effects.

    I should also point out that this UHI maladjustment issue goes back two years (at least) on Climate Audit and therefore I would assume that Gavin already is aware that some stations are warmed rather than cooled by the correction algorithm. When I have numbers to compare with those on Climate Audit, perhaps these will suggest that some corrections have been made. Checking the stations that Steve McIntyre featured as warmed by UHI correction I have not seen any evidence of such correction so far.

    As for running CCC-GIStemp I would love to, but code is not my thing and just at the moment I am getting an awful lot from doing it a different way.

    LET’S START with what Nick says:

    If the local stations had no missing data in 1951-80, this would have the same effect as calculating station anomalies and then grid-averaging. If they do, then the effect is the same as if you had calculated station anomalies, replacing missing 1951-80 data by the local grid-point average calculated with that tapered RCRIT weight – i.e. a reasonable local approximator. So there is not much difference between this method and the climate anomaly method, where individual station 1951-80 means are calculated with an interpolation formula for missing years from nearby stations. It just implicitly makes use of a reasonable gridpoint-centred interpolation formula.

    LET’S LOOK AT WHAT GISS SAYS:

    We employ all stations for which the length of the combined records is at least 20 years; there is no requirement that an individual contributing station have any data within our 1951-1980 reference period.

    Now, recall that one of the problems Jones had with moving the reference period in IPCC to 1971-2000 was that his method required that the station have data in the 1961-90 period, and that he questioned GISS on their weird way of handling this.

    It would appear to me that if you have a simple case of 2 stations, 1 at high altitude and 1 at low altitude, then if you take their anomalies with respect to themselves and then average the 2 anomalies, you get the average anomaly. So, in one grid, if the station on a mountain top records anomalies of 1,1,1,1 and the one standing at the base of the mountain records anomalies of 1,1,1,1, you get an average of 1. Doh.

    And if you first average temperatures and you have the same stations, you might be averaging 0C from the station at the mountain top with the station at 10C at the base of the mountain. Again, if you average these to 5C and take the anomaly, then you get an anomaly of 1,1,1,1.

    But if you drop out the one at the top of the mountain halfway through and then average, you get

    5,5,10,10

    and the anomaly method cannot save you.
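    A minimal numerical sketch of the two orderings being contrasted here, using the same two hypothetical stations (0C on the mountain, 10C at the base) with the mountain station dropping out halfway through. The baseline years are an illustrative assumption, and this only shows that the ordering can matter when a station drops out and the grid baseline is held fixed; it is not a claim about what GISTemp actually computes (see Nick’s reply below):

        # (a) anomaly per station first, then average ("selfing");
        # (b) average raw temperatures first, then take the anomaly of the averaged series.

        mountain = {1: 0.0, 2: 0.0, 3: None, 4: None}    # drops out after year 2
        base     = {1: 10.0, 2: 10.0, 3: 10.0, 4: 10.0}
        baseline_years = [1, 2]

        def station_anomaly(rec):
            ref = sum(rec[y] for y in baseline_years) / len(baseline_years)
            return {y: (v - ref if v is not None else None) for y, v in rec.items()}

        # (a) average of per-station anomalies; missing values simply drop out
        anoms = [station_anomaly(mountain), station_anomaly(base)]
        a = {y: sum(s[y] for s in anoms if s[y] is not None) /
                len([s for s in anoms if s[y] is not None]) for y in base}

        # (b) anomaly of the averaged raw temperatures, baseline fixed from years 1-2
        avg_raw = {y: sum(v for v in (mountain[y], base[y]) if v is not None) /
                      len([v for v in (mountain[y], base[y]) if v is not None]) for y in base}
        ref = sum(avg_raw[y] for y in baseline_years) / len(baseline_years)
        b = {y: avg_raw[y] - ref for y in avg_raw}

        print(a)   # {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}  -- no spurious step
        print(b)   # {1: 0.0, 2: 0.0, 3: 5.0, 4: 5.0}  -- a 5C step from the dropout alone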

  110. vjones said

    #109 Steven Mosher,

    I only have the GHCN data in easily interrogable form in front of me, but this is the input for GISS: there are 410 unadjusted stations with at least 20 years of data for which the data series ENDS during the GISS base period of 1951-1980 (this is out of just over 4500 of the stations that make it through to the final adjusted GHCN set, i.e. roughly 10%). These, according to your description, have the potential to affect the anomaly calculation in any grid square with which they are associated.

    I am glad this issue is getting so much attention, as I too fell for “the anomaly will save us” (and argued with EM Smith about it, which is not something you want to do…); however, I have now come to see this issue as fundamental to the whole Global Average Temperature, however fundamentally flawed a concept that is (IMO).

  111. Nick Stokes said

    Steve #109,
    I don’t think it works like that. To simplify your example further, suppose the mountain station always measures 0C, and the sea level one 10C. And suppose they are equidistant from gridpoint G, and the only stations contributing.

    Then the average 1951-80 at G is 5C, and the months in 2009 will all report an average 5C – anomaly 0 at G, as it should be.

    Suppose in March 2009, the mountain station is dropped. Then the 1951-80 average is recalculated without it, and the 1951-80 average is now 10C, as is the observation. Anomaly still zero, as it should be.

    Where you do have a problem is if the mountain station had, say, 50% missing data during 1951-80. Then the average reported for that period is 6.66C, and in Feb 2009, the anomaly is 1.66C. As long as both stations are included, this is constant, so no real problem, but in March, it drops out and the anomaly drops to zero.

    That’s an extreme case; normally there are more stations contributing, and I believe that a station with less than 20 years in the base period would not be accepted. GHCN is also now not keen on mountaintops. These issues influence the rather restrictive policy re ongoing stations.

  112. Vjones and Nick.

    Long ago I raised this issue on CA and Hu straightened me out, which put me in Nick’s camp. And then I posted on WUWT essentially agreeing with Nick. Then I went back and read some more of EM Smith and thought back to how the code worked (the station creation code is a devilish piece of work); anyway, I decided that rather than state anything categorically I’d much rather start with some testing of assumptions about the removal of stations. For example, if we are down to 1500 stations or so from a high of whatever in the 50-80 period, then a simple test would be to run the whole thing with just the surviving stations. EM has already been there, done that, and the result was a code crash… or rather he said the code was brittle to station removal. From what NASA write, it would appear that you have some stations included that don’t have a representation in the 50-80 era.

    Without some solid cases to step through the algorithm with, I’m just stuck, although I see points on both sides, obviously.

    A simple test case like Madagascar might be a good one to look at. The other thought I had was to look at Canada north of 60N, both in the UAH (RSS) data and in the GHCN.

    But I’ve got like zero time.

  113. E.M.Smith said

    Carrick said
    This bothers me when people make claims like this and don’t back them up.

    What bothers you is not my problem, frankly. I’ve got a whole blog site up looking at GIStemp stuff and, as vjones pointed out, this issue has been kicking around for a couple of years at other sites as well. In particular, I went through the case of Pisa Italy that gets a 1.4 C “correction” in the wrong direction. You can find it in about 1/2 minute on the blog. Less with a simple Google search:

    Results 1 – 10 of about 113 for GIStemp UHI Pisa wrong way

    or for more broad coverage of the topic:

    Results 1 – 10 of about 2,340 for GIStemp UHI wrong way
    Results 1 – 10 of about 9,280 for GIStemp UHI error

    It’s not exactly a hard problem to find… and probably not something where I ought to be cluttering up someone else’s blog with a few thousand links… and it’s been discussed on WUWT with various blinker graphs for a couple of years with folks pondering why things are so bizarrely adjusted.


    What percent of UHI “corrections” went the wrong way?

    I don’t really care (though IIRC others found about 1/4 to 1/2 in ROW). Even ONE shows they did their code wrong. UHI NEVER ought to be negative. Either put in an “if negative, do nothing” line of code or figure out why you got a “wrong way correction” and fix the method, but to leave it as is simply admits you have buggy code.

    BTW, if there are not sufficient “rural” stations PApars.f just passes the data through “uncorrected”. So station dropouts will increase the percent of ‘non-adjusted’ in your “adjusted” result. Expecting that GISS has done a UHI adjustment when it has not is an issue.

    Further, the dropouts have caused, IIRC, 500 or so out of about 1200 stations used as “rural” to be, in fact, airports. (And many without the ‘airstn’ flag set have “AIRPORT” or a substring of it in the description field, so that number is low.) Airports are known to have a warming bias. So on the strength of that statistic we can expect very significant error in the UHI correction that is done. One airport was used about 270 times to “correct” neighbors. That station is getting a new Industrial Park and has an Air National Guard regional facility… (details on the blog under the “agw and gistemp issues” category; look for “airports correct UHI” and for “most used airport for UHI”)

    Otherwise, if you don’t fix the code as described, you are not doing UHI correction, you are doing adjacent-station correlation adjustments… and with 92% of the stations in the USA in GHCN being airports, and about 86% in New Zealand being airports (all but one, which is located on a northern, more tropical island: Raoul), this is going to give a very biased result.

    (And yes, if you do a Google search:
    Results 1 – 10 of about 103 for GIStemp New Zealand Raoul
    my blog page describing this comes up as #3 on the list.)

    http://chiefio.wordpress.com/2009/11/01/new-zealand-polynesian-polarphobia/

    As per telling the maintainers that they have a bug: they know this already. I’m not going to waste my time.

    I have to make a living to support this Climate Analysis ‘habit’ and that takes priority. If anyone wants to give me a paycheck to do this stuff, then sure, I’ll be glad to accommodate “what bothers them”. As long as it’s my nickel, I accommodate “what bothers me” first. Leftover time then goes to “what bothers the spouse and kids”. After that is “what bothers friends”. Then “what bothers the cats” and “what bothers the various governments” that think they own my life. Everybody else comes after that. Take a number…

    Steven Mosher said

    2) It is not done via thermometer ’selfing’ by via “apple basket” to “orange basket”.

    I’m not following your analogy on number2. psudo code or math or something to help me understand

    “Selfing” is a term from biology / botany. It means a plant was self-fertilized. I’d expected it to be a cognate (structurally if nothing else) that I used to mean “compare thermometer A in 2009 to thermometer A in baseline”. The “by” ought to have been a “but by”; that omission doesn’t help understanding… And the “Apples to Oranges” was meant to be a metaphorical form of: “Basket in baseline interval is composed of thermometers B, C, E, and F” compared to “Basket in present is composed of thermometers A, D, F, and G”, so making a non-self “anomaly” between those two baskets will ‘have issues’ (as you described in your comments just a bit above this) even with some ‘magic hand waving’ attempts at adjustments.

    Sometimes trying to keep the verbiage down to less than 10 pages is less helpful than I’d hoped ;-)

    @vjones: Thanks for fielding that one. I don’t get here often so it was a pleasant surprise to see ‘one less demand for me to deal with’ ;-)

  114. Carrick said

    EM Smith:

    I’ve got a whole blog site up looking at GIStemp stuff and, as vjones pointed out, this issue has been kicking around for a couple of years at other sites as well. In particular, I went through the case of Pisa Italy that gets a 1.4 C “correction” in the wrong direction. You can find it in about 1/2 minute on the blog. [etc etc etc]

    Wow. Take a valium, dude.

    Simply finding mistakes isn’t a big deal. Finding ones that affect the outcome in a statistical sense is.

    You keep bringing up individual stations from geographically inconsequential weightings of the global mean. Even if New Zealand is 3x higher than it should have been (arguably it’s not that far off), it’s only about 0.05% of the surface of the Earth; that translates into a 0.15% change (when the error bar is 10%).

    My point here is the central question in QC is “how meaningful is the effect of this error”? One focuses on the errors that have the most critical effect first.

    For the record, I wasn’t asking you to do any work for me.

    I fully plan on doing the tests I find interesting myself. I’ve already done a crude version of UHI versus no UHI correction, and what I find is a 0.03°C/century difference in trend (no UHI being higher). That’s right in line with what Jones (1990) claims. It’s interesting but it’s not statistically significant (the uncertainties are more like 0.15°C/century according to my analysis).

    My suspicion is the lump sum of all of the corrections you’ve raised amounts to less than a total of 0.1°C/century. Barely even statistically relevant, and nowhere near what one would need to turn a positive slope of say 1.5°C/century into a zero slope.

    My best to the Mrs.

  115. Harold Vance said

    Jones 1990 should be thrown out of the peer-reviewed literature set. I’m shocked to see that the Guardian smells fresh blood there:

    The leaked emails from the CRU reveal that the former director of the unit, Tom Wigley, harboured grave doubts about the cover-up of the shortcomings in Jones and Wang’s work. Wigley was in charge of CRU when the original paper was published. “Were you taking W-CW [Wang] on trust?” he asked Jones. He continued: “Why, why, why did you and W-CW not simply say this right at the start?”

  116. KevinUK said

    #104 carrick

    “The problem I have with some of the critics of GISTemp is that they are basically pointing to “warts” in the methodology. Every complex code that handles such a range of issues as this code does is going to have warts. The proper question is “how much of a difference does a given wart make to the final answer”?

    In answer to “The proper question is “how much of a difference does a given wart make to the final answer”?”

    Sorry, no! Totally disagree. If it’s a ‘pig in a poke’ it’s a ‘pig in a poke’! If, as is the case, the so-called UHI correction doesn’t do what it is claimed to do, and what the code actually does isn’t what the documentation claims it does, then IMO it’s a ‘pig in a poke’ and needs to be scrapped, not corrected.

    “And my other point is, if you spot a wart that really bothers you, point it out to the code maintainers and suggest a fix for it.”

    Well, Willis E spotted a serious wart with Darwin; I showed that Darwin is but one of hundreds of cases where ‘cooling turned to warming’ and conversely ‘warming turned to cooling’ as a consequence of NOAA’s physically unjustifiable adjustments to the raw data. Willis E notified NOAA and they have agreed with him that their adjustment algorithm (‘wart’) needs to be ‘improved’. They are in the process of doing exactly that, and assuming they are working to their stated timescale we should be seeing this particular ‘wart’ removed very shortly. Sadly I doubt it will be; instead it will just undergo some cosmetic surgery, only to grow back again at some later date.

    By the way I don’t think GISTemp is a complex piece of code at all. It’s just very poorly written and very badly documented. What they are attempting to do in GISTemp isn’t exactly ‘rocket (NASA) science’ either. It’s high time that the responsibility for updating and maintaining the instrumented temperature record was removed from vested-interest groups like the UK Met Office/CRU and GISS.

  117. KevinUK said

    #114 carrick

    “You keep bringing up individual stations from geographically inconsequential weightings of the global mean. Even if New Zealand is 3x higher than it should have been (arguably it’s not that far off), it’s only about 0.05% the surface of the Earth, that translates into a 0.15% change (when the error bar is 10%).”

    I’m sure the people of New Zealand, especially those paying ‘green taxes’ would disagree with you. As a UK taxpayer for sure I totally disagree with your logic that it doesn’t matter if the UK Met Office fiddle with the UK historical temperature records and in doing so ‘get it wrong’ because the UK is ‘geographically inconsequential’.

    “My point here is the central question in QC is “how meaningful is the effect of this error”? One focuses on the errors that have the most critical effect first.”

    No! No! No! What’s central is that the code has been shown to have demonstrable significant errors and that, as implemented, it disagrees with its documentation. Like E M Smith, as someone who makes his living from writing software and who therefore understands that software is useless if it doesn’t conform to its written specification, I find some of your statements (like this one) bizarre!

  118. Carrick

    My point here is the central question in QC is “how meaningful is the effect of this error”? One focuses on the errors that have the most critical effect first.

    Well, if you have a complete list to start with, of course. But that’s not the case here: you have some very opaque code, and as you find oddities you fix them. Like the UHI adjustment. The two-legged adjustment doesn’t even have a check for the direction of the adjustment. Now, theory tells us this (see Peterson 2003, very first paragraph): UHI will cause a warming. When Peterson did not find a warming he POSTULATED (his word) that the urban site was in a cool park (supported by Oke’s work), so sometimes an urban site can be like a rural site. Peterson ALSO noted (citing Wang) that sometimes rural sites were warmer, and Peterson, citing Wang, argued that these rural sites were contaminated.

    Hansen made an algorithm that adjusts warm cities down or up. That makes no sense.

    You have these cases: urban infected U(i), rural infected R(i), urban cool-park U(c), rural pristine R(p):

    1. U(i) R(i)
    2. U(i) R(p)
    3. U(c) R(i)
    4. U(c) R(p)

    In case 4 you would see no adjustment.
    In case 3 Hansen’s algorithm would jack up the urban: WRONG.
    In case 2 Hansen’s algorithm would adjust the urban down: correct.
    In case 1 it would depend upon the level of infection in each, but either way the adjustment would be wrong: wrong in magnitude or wrong in direction.

    The KEY is getting good rural stations. One set of checks could be:

    1. Population TODAY of less than 5K (the population figures GISS uses are from 1995).
    2. ALSO nightlights = dark (this should be updated to CURRENT sat images, not the 1995 ones that only cover the US).
    3. Not at an airport.

    If you have a truly rural site then Hansen’s adjustment “should” work.
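    A minimal sketch of a filter along the lines of those three checks; the station record fields (population, nightlight, is_airport) are illustrative assumptions, not actual GHCN or GISS metadata field names:

        # Filter a station list down to "truly rural" sites using the three checks above.

        def is_truly_rural(station):
            """station: dict with 'population', 'nightlight' ('dark'/'dim'/'bright') and 'is_airport'."""
            return (station["population"] < 5000
                    and station["nightlight"] == "dark"
                    and not station["is_airport"])

        def rural_subset(stations):
            return [s for s in stations if is_truly_rural(s)]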

  119. Carrick said

    KevinUK:

    I’m sure the people of New Zealand, especially those paying ‘green taxes’ would disagree with you. As a UK taxpayer for sure I totally disagree with your logic that it doesn’t matter if the UK Met Office fiddle with the UK historical temperature records and in doing so ‘get it wrong’ because the UK is ‘geographically inconsequential’.

    You can bloviate about this all the way to the moon and back, if you like, but from the point of view of global mean temperature, any 0.05% area doesn’t have a dramatic impact on the trend contribution from the other 99.95%. Seriously you’re just being an idiot.

    No! No! No! What’s central is that the code has been shown to have demonstrable significant errors and that, as implemented, it disagrees with its documentation. Like E M Smith, as someone who makes his living from writing software and who therefore understands that software is useless if it doesn’t conform to its written specification, I find some of your statements (like this one) bizarre!

    “Demonstrable significant errors”???

    Actually Smith hasn’t shown that any of his issues matter one iota. That was the point I made and no other.

    As to it being “completely useless”??? Can you get any more over the top with your rhetoric here?

    One thing you do when studying code that involves numerical approximation is perform sensitivity studies to analyze the relative importance of the different approximations. That allows you to spend time on the parts of the code that matter, and not waste your time polishing code that isn’t critical to the function of the code. This is pretty similar to the process of tuning software for speed, where you profile the code to find which parts are actually worth optimizing.

    By the way I don’t think GISTemp is a complex piece of code at all. It’s just very poorly written and very badly documented

    Keep the BS going and you’ll need a new pair of waders. It’s obvious that you haven’t actually ever looked at the code, now have you?

    Ironically enough the code is well enough documented that it has been replicated by several other groups now. Funny how that works now, doesn’t it?

  120. Carrick said

    Steven Mosher, actually people are putting together a list of things to test. UHI and the homogenization tests are on the list.

    Hopefully you will admit some pretty goofy stuff gets passed off as major problems with GISTemp. If somebody raises the issue of rounding to ±0.5°C, that can be tested in a straightforward manner, and shown to not matter. My point is, one doesn’t need to go back and correct all of the station data and round to ±0.1°C just because somebody thinks it should be rounded to that number. You need to demonstrate that your proposed “correction” affects the outcome; otherwise it’s a pointless exercise in stupidity to go through all of that effort (in this case) to recover the original values of the data.

    The UHI is another good example, because neither you nor I really know whether the UHI correction should always have one sign or not, or even what having the “wrong sign” is diagnostic of. Like Smith we have real lives, real jobs, families to feed, etc. This isn’t our passion; it isn’t what we want to spend the rest of our lives doing. It would be nice if somebody did the definitive study, but if the final answer only gets affected by ±0.03°C/century, nobody will bother.

    From a policy viewpoint it isn’t going to matter much even if you find a real error and correct it. Given that 3/4 of the Earth’s surface is ocean, and that ocean has a net positive temperature trend, it’s highly unlikely that you’re going to find enough corrections on the land record by itself to make that positive trend go to zero. And as long as it is positive and stays that way, it really isn’t going to impact climate policy very much…

    As I see it anyway. Because already the climate data are only very weakly (if at all) suggestive of any need to take drastic action to mitigate climate impact. The proponents have had to exaggerate or lie about just about everything unfortunately…. Himalayan glaciers, Amazon rain forests, effect on hurricanes and violent weather, air borne illnesses, etc. (The list keeps going you know. Did I mention tree rings?)

    So we have a situation where the case isn’t made already. Weaken an unproven case and guess what? You still have an unproven case. It isn’t necessary to exaggerate here to try and undermine them; they’ve undermined themselves (shoes anybody?).

    I have mentioned before that I am an empiricist; to me the art of the measurement matters in and of itself. One should do as well as one possibly can. But what one also shouldn’t do is try and form conclusions about the impact of what you think might be an error (but don’t even know), when you haven’t bothered to do the work to show that it matters a hill of beans. For myself, I find the problems I’m working on more interesting and more likely to make an impact than the conundrum that is urban heating.

  121. vjones said

    #116 KevinUK
    “GISTemp isn’t exactly ‘rocket (NASA) science’ either” LOL

  122. Hopefully you will admit some pretty goofy stuff gets passed off as major problems with GISTemp.

    yes.

  123. Re: E.M.Smith (Feb 1 19:28),
    “and about 86% in New Zealand being airports (all but one that is located on a northern more tropical island: Raoul) , this is going to give a very biased result.”

    I thought I should look up these NZ airports. This is what I found:
    The current NZ GHCN sites are:
    Gisborne
    Christchurch
    Hokitika
    Kaitaia
    Chatham
    Invercargill
    Campbell Island
    Raoul Island

    It’s agreed Raoul Island is not an airport. Let’s look at the rest:

    Campbell Island is remote and uninhabited. There is an automated weather station (used to be manned), and an airstrip on the island. There might be two or three planes landing a year.

    Chatham Islands are about 800 km from the mainland, and have about 800 inhabitants. There is one flight in/out each weekday

    Hokitika is a remote town with about 3000 people. Its airport has about five flights a day, in/out.

    Kaitaia is a remote town of about 5000 people. Its airport has two flights a day to/from Auckland. There may be some local light plane traffic.

    Gisborne is a larger town, but its airport has 1 paved strip and 3 grass (and a railway line passing through the main strip).

    Invercargill has about 50,000 people, and the airport has 12 commercial flights a day.

    Christchurch is a large town with a busy airport. It is, however, quite spacious.

    Is this such a source of bias?

  124. RuhRoh said

    Hey Jeff

    So whaddaya think about this guy?

    http://www.bestinclass.dk/index.php/2010/01/global-warming/

    Besides the proglang he chose, is there anything else novel about his analytical approach?

    Is he retracing the steps of others or is he arriving at a similar destination by a unique computational pathway?

    Maybe a new thread for this one?
    I also floated this question at Chiefio’s place…
    RR

  125. Re 123.

    Good catch Nick.

    I think one of the problems ( on both sides) is jumping to conclusionism.

    Here is what I know: temperature measured in 1850 wasn’t at an airport. If it’s measured at an airport today, I have to take care to demonstrate that the airport effect is accounted for.

    I can’t wave it away and say “it should make no difference”; well, I can, but then I have to acknowledge that I haven’t shown that it makes no difference.

    I can wave my hands about many things and say that it shouldn’t make a difference. On the other hand I can also wave my arms and argue that certain things should make a difference.

    From theory we can suspect that changing the physical geometry of the place (smoothing the surface, roughening the surface, adding trees, taking trees out, adding buildings, etc.) will change the flow of air, the turbulence, the mixing, the transient temp profile. Changing the material properties of the surface (heat capacity, reflectivity) will also change things in the transient temp profile. This is basic physics. I think a lot of people miss that these are transient phenomena with a lot of spatial variability. That uncertainty at the micro scale allows for lots of arm waving. Anyway.

    So the source of any bias is going to be the changes in physical geometry and the changes in material composition. How big is the bias? Dunno. Which direction? Dunno. Can it safely be ignored? Dunno.

    Who has to prove that the bias is inconsequential?

    you.

    Is a claim that there is substantial bias supported? No.

    I’m in ignorance. Theory tells me the temperature might not be the same given the changes in the physical geometry and material properties. EM and others want to say the bias is obvious. You want to ask what the source of the bias is, or say the bias should be small. Neither one of those positions impresses me much.

    I imagine that sometimes airports give high biases, sometimes no bias. It also probably changes over time.

    On the other hand, if I start measuring at an airport and nothing else changes, then I’m happy, ’cause I only care about the trend. So station history matters. When did it become an airport?

  126. vjones said

    #125 Steven Mosher,

    I came back here to look over the anomaly stuff for EM Smith’s Madagascar thread

    I agree wholeheartedly – nothing can be assumed.

    One airport I looked at was Gardermoen (outside Oslo, Norway):

    “Gardermoen became a military airfield in 1920, but was upgraded, after WWII bombing, with two separate runways, eventually handling intercontinental flights then civilian charter flights in 1972. After another major upgrade it became Oslo’s main airport in 1998, serving more than 19 million passengers in 2008. Gardermoen’s temperature record shows an overall warming (ΔT) of 3.73⁰C/100 years, but there is a strong cooling trend until the late 1980s around the time when the construction work for the airport upgrade was underway. After that there was a temperature jump of 1.5⁰C and a strong warming trend.”

    So this was an airport when the temperature record started, but it has grown substantially and that MAY be the reason for the change; I haven’t even looked at equipment location or type. Even then I’d prefer more detail – is the warming in winter (de-icing of runways)? Have there been trees planted, etc.?

    Blindern (Oslo University, in an open, park-like area of the city, I think), by comparison, shows a lesser cooling trend that is maintained.

  127. vjones said

    Oops, screwed up the html – sorry. Link to Blindern is here: http://1.bp.blogspot.com/_vYBt7hixAMU/Sv30XSvMcjI/AAAAAAAAAF8/l441BUWy8uw/s1600-h/Blindern.png
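
    A sketch of the piecewise pattern described in #126 for Gardermoen: fit separate linear trends before and after a breakpoint and estimate the offset between the fitted lines. The 1988 breakpoint year and the synthetic stand-in series below are illustrative assumptions, not values read from the station file.

```python
# Sketch only: assumed breakpoint year and synthetic stand-in data.
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(1940, 2009)
# Stand-in annual means (degC): mild cooling to the late 1980s, a jump,
# then warming, plus year-to-year noise.
temps = (5.0 - 0.01 * (years - 1940)
         + 1.5 * (years >= 1988)
         + 0.03 * np.clip(years - 1988, 0, None)
         + rng.normal(0.0, 0.5, years.size))

breakpoint = 1988                        # assumed airport-upgrade era
early, late = years < breakpoint, years >= breakpoint

slope_e, icept_e = np.polyfit(years[early], temps[early], 1)
slope_l, icept_l = np.polyfit(years[late], temps[late], 1)
step = (slope_l * breakpoint + icept_l) - (slope_e * breakpoint + icept_e)

print(f"trend before {breakpoint}: {slope_e * 100:+.2f} degC/century")
print(f"trend after  {breakpoint}: {slope_l * 100:+.2f} degC/century")
print(f"apparent step at the breakpoint: {step:+.2f} degC")
```

    Evaluating both fitted lines at the breakpoint year gives a crude estimate of the step; whether that step is airport construction, equipment moves, or something else is exactly the station-history question raised above.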

  128. Re: steven mosher (Feb 4 14:16),
    Who has to prove that the bias is inconsequential?
    You.

    I disagree. AGW follows from gas radiative properties and the rise of CO2. The thermometer record is not part of its proof.

    Persons of a denying tendency like to focus on measured temperatures, hoping I guess to find a weakness. I think that’s a mistake. AGW theory did just fine back in the 70’s and 80’s, when I first encountered it, when there was no GIStemp, and really no accessible global temperature at all. Now the temp record is consulted to answer the folks who say
    “If you’re so smart, how come it ain’t warm?”
    If the answer is, well the temp record isn’t good enough to tell, then that objection fails. You’re left to answer the radiative physics.

  129. RomanM said

    #128

    AGW follows from gas radiative properties and the rise of CO2. The thermometer record is not part of its proof.

    I disagree. This is a naive view.

    A major portion of the debate is the amount of warming to be expected. Virtually the entire case rests on incompletely understood feedbacks from humidity, clouds and other sources, so the radiative physics alone is not sufficient without empirical observation of temperature changes. The accuracy of the temperature record is an integral part of the argument.

    If “the temp record isn’t good enough to tell”, then the burden of proof still rests on the proponents of major change.
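
    To put rough numbers on #129’s point (standard textbook values, my illustration rather than anything cited in this thread): the doubled-CO2 forcing and the no-feedback Planck response are comparatively well constrained, and most of the spread in projected warming comes from the assumed net feedback fraction.

```python
# Back-of-envelope illustration only; the values below are common textbook
# approximations and are assumptions for this sketch, not thread data.
import math

F_2xCO2 = 5.35 * math.log(2.0)   # ~3.7 W/m^2, simplified doubled-CO2 forcing
planck = 3.2                     # W/m^2 per K, approximate no-feedback response

dT0 = F_2xCO2 / planck           # no-feedback warming per doubling, ~1.2 K
print(f"no-feedback warming per doubling: {dT0:.2f} K")

# With a net feedback gain fraction f, the equilibrium response is dT0 / (1 - f).
for f in (0.0, 0.4, 0.6, 0.7):   # illustrative feedback fractions only
    print(f"feedback fraction {f:.1f}: {dT0 / (1.0 - f):.1f} K per doubling")
```

    The roughly 1.2 K piece is the bare radiative physics; getting from there to 2 K or 4 K per doubling is a question of feedbacks, which is where the empirical temperature records come in.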

  130. Nick Stokes said

    #129
    “If “the temp record isn’t good enough to tell”, then the burden of proof still rests on the proponents of major change.”
    There isn’t much point in haggling about the burden of proof. The job of scientists is to investigate the science as best they can and report it. Some will say that AGW is a worry, and seek action in the absence of refutation. Some will say spending money is a worry, and demand proof beyond reasonable doubt. Those are just the viewpoints of parts of the public, and scientists can only say – “we report, you decide”.

    And yes, the temp and paleo records are needed to help substantiate feedback estimates. That’s not the use which is usually scrutinised, though.

  131. Carrick said

    Nick Stokes:

    Persons of a denying tendency like to focus on measured temperatures, hoping I guess to find a weakness. I think that’s a mistake. AGW theory did just fine back in the 70’s and 80’s, when I first encountered it, when there was no GIStemp, and really no accessible global temperature at all. Now the temp record is consulted to answer the folks who say

    People of the “denialist” sort, I find, also don’t like admitting that the science treats the warming prior to the mid-1970s as largely natural (because anthropogenic CO2 and sulfates approximately balanced before then).

    Since that period, we have overlap between ground-based and satellite data, and they all tell basically the same story: a gradual but steady climb in temperature over a multi-decade scale.

    The other thing they don’t like is the observation that the actual warming predicted by CO2 is a very gradual long-term thing, one that for now looks more like loaded dice in the game called “natural climate variability” than a real offset. I don’t know that we can attribute the warming we are seeing this January to man-caused climate change, but I can pretty well guarantee that even if we stop pumping excess CO2 into our atmosphere, the multi-decadal average temperature will then be warmer than it is now.

    And like Nick says, you don’t need a ground-based thermometer to tell you that. The proof lies elsewhere.
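
    A toy version of the “loaded dice” framing in #131, with every number below assumed for illustration: a small imposed trend is lost in any single year’s noise but shows up clearly when 30-year means are compared.

```python
# Toy illustration only; all numbers here are assumptions, not observations.
import numpy as np

rng = np.random.default_rng(42)
years = np.arange(1975, 2035)                # 60 synthetic years
trend = 0.15                                 # assumed forced trend, degC per decade
noise_sd = 0.25                              # assumed interannual variability, degC

anomalies = trend * (years - years[0]) / 10.0 + rng.normal(0.0, noise_sd, years.size)

first = anomalies[years < 2005].mean()       # first 30 years
last = anomalies[years >= 2005].mean()       # last 30 years

print(f"year-to-year spread (1 sd):   {anomalies.std():.2f} degC")
print(f"mean of first 30 years:       {first:+.2f} degC")
print(f"mean of last 30 years:        {last:+.2f} degC")
print(f"shift between 30-year means:  {last - first:+.2f} degC")
```

    The single-year spread dwarfs the decadal trend, yet the 30-year means separate cleanly, which is the sense in which the dice are loaded.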

