the Air Vent

Because the world needs another opinion

The state of climate temperature data.

Posted by Jeff Id on May 11, 2010

Bart Verheggen left a comment on the recent global land temperature thread which supports Eric Steig’s comment that one of the reasons scientists don’t have good surface temperature data is that the work is unglamorous.

Eric Steig made some very insightful comments here. I have the same experience as he describes:

– monitoring and data archiving is deemed uninteresting by most scientists
– it’s difficult to get it funded by the regular scientific funding channels.

Part of the problem lies with how scientific work is valued: largely by the quantity of papers. Monitoring and data archiving are therefore a very poor investment in the way science is currently set up. That is indeed a problem, and I know lots of scientists who argue for more monitoring, and many scientists who loathe the very narrow-minded criterion of the number of papers.

The link Bart provided has to do with CO2 monitoring, but it is applicable to several areas of climate science and worth reading.  One of Steve McIntyre’s main concerns has been reasonable archiving of paleoclimate data.  Many of our concerns have revolved around the lack of data and quality control for surface temperatures; others have spent many keystrokes lamenting the quality of CO2 measurements or ocean pH.  Climatology differs from other sciences in that we’re measuring something very big and, in nearly every case, with instruments poorly suited to the work.  This does not mean the data is unworkable, but it does mean that the best quality control is required.

I think climate scientists see skeptic complaints about these data as obstructionist, preferring bulk averaging methods on the assumption that the errors will come out in the wash.  Some critics and bloggers overstate the problem in comment threads, which doesn’t help, but I believe both sides are wrong to some extent.  No matter what we do, surface temperatures will show warming of the planet, and the temperature rise may even be of the same magnitude, but there is plenty of room to imagine that a fair chunk of the warming we see is caused by UHI effects.  With instrument data in its current state, we cannot determine the true magnitude of CO2 warming by simply processing the data.  We have to make assumptions about quality (i.e. reasoned guesses).

Jones participated in a study which was widely critiqued on UHI for its unsupportable claims of instrument quality.  Its conclusion was that UHI wasn’t a factor, but the study was incomplete and um…unconvincing.  Eric Steig wrote an RC post in which 60 randomly selected stations produced the same trend as CRUTEM3 land data.  Again, we don’t know where these stations were from, and the selection process could have favored city centers.

We know that UHI caused warming of many thermometers, simply by understanding that UHI exists.  It is warmer inside city centers than in the countryside, and these city centers were tiny a hundred years ago.  Therefore we have at least some warming of the surface measurements which is completely unrelated to CO2 driven climate change. If Eric Steig’s RC post accidentally selected all city centers (and I suspect it did), and his result matched CRU, we could reasonably conclude that UHI is a significant factor in CRU data — but how much would still be unknown.

Recently there has been a massive dropout of GHCN surface stations from the record: more than 2/3 of the stations reporting since 1990 have vanished.  Why, and which stations, is poorly understood.  The missing data could have measured greater warming or less; we don’t know, but 2/3 is enough to bias the record.  Some skeptics have tried, without success, to divine the difference by region or by finding a few stations, but without funding and cooperation from the climate community there is little hope.  Some scientists have tried, equally unsuccessfully, to declare the record good by similar methods.  This situation is alarming to those of us who don’t share the enthusiasm for the political solutions presented by the UN and IPCC, while it’s a shoulder shrug for those who prefer the result.
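
To make the concern concrete, here is a minimal synthetic sketch, not based on the actual GHCN data, of how station dropout can bias a naive network-average trend when the dropped stations are not a random sample.  The station counts, trends, and noise level below are all invented for illustration:

```python
import random

def ols_slope(series):
    """Ordinary least-squares slope of a series against its index."""
    n = len(series)
    xbar = (n - 1) / 2
    ybar = sum(series) / n
    num = sum((i - xbar) * (y - ybar) for i, y in enumerate(series))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

def network_trend(stations):
    """Average all stations each year, then fit a slope to the mean series."""
    years = len(stations[0])
    mean = [sum(s[y] for s in stations) / len(stations) for y in range(years)]
    return ols_slope(mean)

random.seed(0)
YEARS = 30
# Toy network: 50 stations warming at 0.01 C/yr and 50 at 0.03 C/yr.
slow = [[0.01 * t + random.gauss(0, 0.05) for t in range(YEARS)] for _ in range(50)]
fast = [[0.03 * t + random.gauss(0, 0.05) for t in range(YEARS)] for _ in range(50)]

full_trend = network_trend(slow + fast)         # close to 0.02 C/yr
biased_trend = network_trend(slow[:10] + fast)  # dropout removed mostly slow stations
```

If the dropout were random, the network trend would be essentially unchanged; the bias appears only when dropout correlates with the local trend, which is exactly what cannot be checked without knowing which stations vanished and why.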

Like many here, I see the politics of the solutions as highly destructive and the lack of concern by the climate community as misguided.  I’ve got a ton of experience measuring unglamorous data, at least enough to know that what you expect is not always what you find.  Were this not the case, measurement would be a lot less important.  Sure, taking data isn’t glamorous work, but it is absolutely, positively necessary.  That matters more than glamor.

Eric Steig made this point.

In any case, I suspect that Phil Jones did about as good a job as can be done, and it’ll be an utter waste of time to redo it, especially if it doesn’t change the credibility problem at all. And I don’t see how it will. I mean, if the results show less warming, maybe people will believe it, but that’s not likely since CRU’s work shows less warming than anyone else. And what if the MET office finds that actually CRU greatly underestimated the warming? Then what? You really think that all of a sudden the ‘numbers’ will change people’s minds? That’s pretty naive. And please don’t tell me the difference will be ‘transparency’. Someone will always find something minor to complain about, and blow it completely out of proportion. So even if the work is totally ‘credible’, those that don’t like the answer will find a way to make it appear ‘not credible’.

As natural skeptics of big claims, most readers here disagree that Jones did the best that could be done, and even that redoing it would be a waste of time.  Declarations of accuracy are a bit of crystal-ball work IMO, but he may turn out to be completely right; in any case, we won’t know from these records in their current condition.  He is right that people will find ways to criticize whatever the result is; there are always critics.  That point has occasionally been used at the embattled RC to justify the arguments against skeptics.

There is one very big difference between the current situation, where we have appropriate and deserved criticism of the climate measurements, and one where the data is properly recorded and open for examination.  Were the records clean (within reason, guys) and the raw data openly available, those remaining critics of climate data…,

wouldn’t be right.

The climate community, after Climategate, is passing up a unique opportunity to change tack and approach the science in a supportable fashion.  Scientists do mundane work all the time; it’s past time we had a proper structure for the long-term recording, quality control, and storage of global climate data.  We can start with temperature.

73 Responses to “The state of climate temperature data.”

  1. DeWitt Payne said

    If it’s important but doing it is boring, then it shouldn’t be done by academics at all. We need the equivalent of the Bureau of Labor Statistics.

  2. Fred N. said

    UHI effect in action in Ottawa, Canada:

  3. Tom Fuller said

    This is really both a good and important post. I’m extremely pleased that Dr. Steig and Bart both engaged on this issue and hope this can continue.

  4. j ferguson said

    Dr. Steig says: “You really think that all of a sudden the ‘numbers’ will change people’s minds? That’s pretty naive. And please don’t tell me the difference will be ‘transparency’.”

    Is this really the point? Changing people’s minds? How about the thing itself, better accuracy?

    Clearly there has to be a motivation to spend the money to rebuild, or build anew, these datasets.

    How about improving our grasp of the matter?

    Forgive me, Dr. Steig. I suspect that your view is the more practical, or realistic.

  5. Carrick said

    It may be “unglamorous”, but there has been plenty of money poured into climate monitoring. So lack of money isn’t really the issue.

    And Phil Jones certainly could have “done a better job”. What he actually did was in some respects quite cursory, as the released CRU code demonstrates. That was a lot of NSF funds for what wasn’t in many respects a very engaged effort. Excuse my cynicism, but I wonder how often when people refuse to release their code, it’s because it would air their dirty laundry in public?

    And for all of that “unglamorousness”, there have been maybe a dozen replications of the global mean temperature. Oddly, there are plenty of people who find this “unglamorous” task interesting enough to have bothered replicating it.

  6. Pat Moffitt said

    When environmental “crises” are pretested, prepackaged and pre-sold – there is no need for data! Data is at best an annoyance and at worst a threat.

    Environmental NGOs test market “potential” crises using direct mail regional ad campaigns. Those campaigns that resonate with the public (via donations) are developed into national campaigns. A national campaign commences the full advertising and public relations push complete with press releases. Media then heralds the crisis. EPA and Congress react to public perception, and grants start to flow with a goal to “prove” the crisis and heighten public alarm. Multiple interests see the opportunity and the threat of the new crisis and begin to organize. The first message heard when the public perceives a crisis controls the narrative. As soon as the public perceives the crisis, the debate is declared to be over. The crisis becomes increasingly political and as such can never be found incorrect. EPA declares a new program to address the crisis, lawyers sue those responsible for the crisis, academics who jump on early get tenure. The crisis becomes a political battle – potential winners and losers pour vast sums of money into PACs and allied NGO campaigns. Political deals are struck with rent seekers, the news media sells the advertising rights to the political spectacle, and the crisis becomes associated with belief systems. Science is simply a pawn in this game.

  7. Kenneth Fritsch said

    I do not for a moment believe that the keeping and statistical analysis of temperature data sets is given less attention than it deserves by climate scientists because the work is boring, uninteresting, or unrewarding. It gives most climate scientists, and particularly those in the consensus camp, trends that agree fairly well with their scientific and advocacy consensus positions. If someone did an analysis that ran counter to the consensus view, it would provoke a very large and detailed study from the consensus scientists – their interest would suddenly soar. I would also bet that it would put more uncertainty on the current data sets than had previously been established.

    What I find most unsettling about these temperature data sets is the difficulty of assigning something approaching a quantitative uncertainty to the data. I have been reviewing the most recent works and attempting to better understand the approaches taken by climate scientists and statisticians. I even had to learn how to do PCA and was surprised how intuitive the methods are once you start messing around with the data and attempt to make some sense of it – not that PCA can be used without its own set of caveats. One thing I want to do is determine how stationary the covariances of stations within a grid are over time. I also want to decompose the data by category: rural, suburban, and urban.
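
    For what it’s worth, the PCA step described here can be sketched in a few lines. This is a toy example on synthetic station anomalies – the station count, noise level, and shared “regional signal” are all made up – and is not Kenneth’s actual analysis:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy grid: 8 stations, 40 years of annual anomalies that share one
# regional signal (a random walk) plus independent station noise.
years, stations = 40, 8
regional = np.cumsum(rng.normal(0, 0.1, years))            # common signal
data = regional[:, None] + rng.normal(0, 0.05, (years, stations))

# PCA by eigen-decomposition of the station covariance matrix.
anoms = data - data.mean(axis=0)                           # center each station
cov = anoms.T @ anoms / (years - 1)
eigvals, eigvecs = np.linalg.eigh(cov)                     # ascending order
explained = eigvals[::-1] / eigvals.sum()                  # descending share

pc1_share = explained[0]   # with one shared signal, PC1 dominates
```

    With one shared signal the first principal component soaks up most of the variance. Checking whether the covariance matrix, and hence the leading eigenvectors, look the same in the first and second half of the record is one crude way to probe the stationarity question raised above.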

    I was thinking that some of this data manipulation might be nothing more than curve fitting, and was a bit surprised to see Shen, in Shen et al. (2007), make a statement in the paper to the effect that while what they were doing to determine uncertainties in monthly 5 x 5 degree grids may seem like curve fitting, it really was not. They also brought up the stationarity problem as it applied to earlier work done by Phil Jones, which Shen felt they had overcome by using monthly, as opposed to decadal, temperatures. The paper also makes rather clear that to do their type of analysis certain assumptions have to be made, e.g. that certain grids well populated with stations yield a true mean temperature over time and space, and/or that the same true mean temperature in a grid can be obtained with a climate model.

    Jeff ID, when you bring up the UHI effect I think it is important to mention the arguments put forth by those scientists who tend to minimize it. They do not deny that the UHI effect exists and can be quite large, but their point will always be that it has not changed enough over time to significantly affect the trend calculations. They will also cite the mitigating effect of urban stations migrating from the “city” to surrounding airports. Following what they propose (hand waving?) brings out the importance of the micro-climates in the locale of the stations.
    This gets back to the work and findings of the Watts team’s CRN ratings for USHCN stations in the lower 48 states, and raises the question of what a good statistical analysis of those CRN rating findings will reveal. I also agree that micro-climate will overwhelm changes in UHI, but it has been pretty well ignored by scientists (who tend to prefer doing their analysis from their ivory towers). I also have my doubts that change-point analysis of the station data can find temperature changes due to a slowly changing micro-climate. Some of these effects could perhaps be studied with a computer simulation of micro-climates.

  8. From the NASA/GISS email releases, 417760main_part4.pdf, Page 54, Phil Jones has some interesting points:


    “Re: Recent events

    National Climate Centre in Melbourne.

    3. We’ve got all the long NZ series they have homogenized.
    Problems with both Australia and NZ associating these with WMO IDs we had.
    Why it’s always the English speaking countries is odd? Maybe this is because
    we can find out/understand more easily what they’re doing!

    4. My biggest worry is China. CMA don’t measure at airports, and they keep
    moving suburban locations a few more miles out as the cities expand. I was
    there a month ago to give some talks. I’ve sent them all the CRU data for
    China, in the hope that they will reciprocate at some point and send me
    their adjusted data (for site moves, but not urban influences).
    They are doing some reasonable work, but not seeing the big picture…

    Other issues:

    1. I reviewed a paper by NCDC (Smith/Reynolds! Peterson) recently. It was OK,
    but when it comes out it will raise the whole debate again. SSTs are being
    increasingly measured by buoys (drifting and fixed) and they now
    dominate over the ships. It seems they are about 0.1 -0.2C cooler
    over the ships. So NCDC will be increasing global temps from about 2000

    2. SSTs are now coming in for the areas losing Arctic sea ice. The normals
    we have for these are -1.8C which is completely wrong. Shortish time series
    are composed of entirely positive anomalies. Maybe this is true, but it
    probably shouldn’t be as much as it is. This problem will get worse as
    the sea ice continues to go. Your use of land only data shouldn’t have
    the problem.

    The SST issues highlight that it is the biases (bucket/intakes and
    that are important as they are potentially pervasive. Individual station
    homogeneity issues cancel as sites are all affected differently. Getting
    this right has hardly any effect (none in fact) on the large-scale averages.
    Might affect smaller regions, and it’s good to get as many right as possible,
    as the deniers will claim if one is wrong the whole lot is wrong. The law of
    large numbers seems to be totally forgotten by those collecting pictures
    of siting across the US. Still it gives them something to do…

    Phil ”

    I think we all know who the ‘picture collectors’ are. As you can see, there are several revelations in that text.

  9. timetochooseagain said

    Not only is scientific funding unwisely directed toward volume of papers, it also rewards failure more than success. If a problem is “solved” or an issue is found not to exist, the funding stream disappears.

  10. j ferguson said

    On funding as a follow up to failure, I take it you mean “We need more money, we’re not quite there yet.”

    I’ve begun to think that Mann and Briffa have done the world a great service in giving the full court press, and more, to the idea that tree rings could somehow provide proxy temperature signals reaching into the uninstrumented past, and despite their best efforts they could not make it work.

    So maybe we can conclude that this wasn’t a good place to put further grant money. So this failure would be the end of that funding stream – but did anyone get tenured?

  11. Kenneth Fritsch said

    Individual station homogeneity issues cancel as sites are all affected differently. Getting this right has hardly any effect (none in fact) on the large-scale averages. Might affect smaller regions, and it’s good to get as many right as possible, as the deniers will claim if one is wrong the whole lot is wrong. The law of large numbers seems to be totally forgotten by those collecting pictures of siting across the US. Still it gives them something to do…

    Phil Jones here is demonstrating hand waving at its best (worst). How in the hell would he know that the differences cancel unless someone has done a physical site study and a good statistical analysis of the results? He is, like many of his associates, always in a big hurry to write off any problems as offsetting. He knows that almost all of the “official” adjustments to temperature data are in one direction, so how does that square with these off-the-cuff statements?

    Question: why are these emails easier to explain as the writing of advocates than as statements with any basis in science? Also, if Dr. Phil would like some suggestions that might keep him busy (and productively so), how about looking at those local/regional problems instead of hand waving them away? I hate to say this about an esteemed academic, but his work, as we know it, would tend to make one think he is sloppy and lazy.

  12. timetochooseagain said

    10-Yes, that’s exactly what I mean.

  13. John F. Pittman said


    JeffID, WUWT has posted what appears to be the proposed Kerry-Lieberman American energy outline. Some are wondering if it is real. If not, someone knows how bills are floated, and it has the correct language.

    It is also a huge tax that they claim is not one. It ought to be round-filed for the lie it tries to pull off, though I doubt anyone who is not a believer, or who has better than a 7th grade education, will be fooled. It is a huge tax. How huge depends on how much money Washington spends, and I have seen few limits to that.

    So, I guess this bill should be called The American Transfer of Power and Forever Tax Act.

  14. WillR said

    Doesn’t this quote from the reader background say it all…???

    When I was a student, the people who gravitated towards data collection were those who couldn’t cut it in the more fashionable hard science. The plodders. As circumstances thrust them into the scientific limelight, their lack of real ability became only too obvious.

    Too many people view the data collection and theory verification stage as mere “plodding”. Sorry to pick on someone — I wanted to do it then, but did not want to disrupt the thread. Now is a more opportune time.

    A tenured Professor of Mathematics made an almost identical sneer to me when we were working together on a project funded by my money. Well, modeling NP processes is a little tough, and since we were writing algorithms for independent, cooperating agents, that made it all the tougher. Without the plodding and bookkeeping there was no way to validate the many models. After thinking about it, I decided that if he would not assist in that part he was probably too good for us and would be happier with other funding and other projects.

    Since we were able to achieve a 2/3 gain in process efficiency I thought then and still do now that a little bit of plodding can be good for the soul. Incidentally the models for the agents — NP problems were my algorithms — I was simply asking that the university team improve them.

    You need both and you need stellar, conscientious work on both sides — theory and proof when working with large models in the NP space.

    I could say a lot more but that should suffice. Apologies to the person I quoted if I have offended you.

  15. Steve Hempell said

    For the past month or so, I have been looking at the global temperature record (GHCN mean ver 2), using at first Jeff’s posted script and then Nick Stokes’s posted script (mostly ver 1.4). I can assure you, there are hours (days, months?) of very interesting work that can be done with these scripts.

    I have sliced and diced the data in many ways in that time and can think of many more ways I want to look at the information. I have looked at the data by latitude, by RSS satellite latitude, by arctic “chunks”, by country, by region, by continent, by different period trend, by Canadian province. I have a printed temperature record (1881 to 2009) of every country that appears in the inventory database. I have used that to create a global temperature trend map from the 84 best countries’ historical temperature records. Of the 206 countries in the GHCN inventory, only 41% have a complete temperature record from 1881 to 2009. The other 59% are countries with records suitable only for the first part of the century (15%), only the latter part of the century (24%), or pretty much useless for anything (20%).

    Observations so far:

    Using only the best countries (6042 stations overall), the temperature has increased in two stages since 1881. The overall trend was 0.068 degrees C per decade.
    In the first stage the trend was 0.125 deg C/dec (1901 to 1940); in the second, 0.250 deg C/dec (1970 to 2009).
    No real surprises there. However, the increase is a little less dramatic when seen from 1881.
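
    The per-decade trends quoted above would presumably be ordinary least-squares fits to annual means, scaled by ten; the scripts themselves aren’t shown here, so treat this as a sketch of the standard calculation on a synthetic series with a known trend, not a reproduction of Steve’s numbers:

```python
import numpy as np

def trend_per_decade(years, temps):
    """OLS slope of annual mean temperature, scaled to degrees C per decade."""
    slope_per_year = np.polyfit(years, temps, 1)[0]
    return slope_per_year * 10.0

# Synthetic check: a series built with a known 0.25 C/decade trend.
yrs = np.arange(1970, 2010)
temps = 0.025 * (yrs - yrs[0])        # 0.025 C/yr = 0.25 C/decade
t = trend_per_decade(yrs, temps)      # recovers 0.25
```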

    The arctic is interesting. Greenland, Iceland and the Faroe Islands show that temperatures now are little different than in the 1930s/40s. “Chunks” of northern Canada/Alaska (west, mid, east) and northern Eurasia again show that the 30s/40s were similar to now. Some areas were warmer in the 30s/40s, some are warmer now. Greenland’s 1914 to 1940 trend was dramatic.

    I have taken each country’s 1901–2009 trend and color-coded it in rworldmap (a work still in progress). There are countries that have become colder in the last 110 years (according to this data): Chile, Italy, the Netherlands, the Czech Republic, Malaysia. Some have warmed very little: Greece, Turkey. The ones that have warmed the most: Canada, Russia (Asia), France, Tunisia.

    Checking this data against the satellite record pre- and post-1998 El Nino, the trends are not the same, especially for UAH. Why? Comparing apples with oranges?

    Trying to analyse Canadian data, especially by province, turned into a nightmare. The province abbreviations are inconsistent: BC, for example, appears as BC, B.C., B, B., and nothing at all. A lot of Ontario’s places were labeled CANADA. The US stations have no state abbreviations at all (but this info is available in another database). A sloppy database from this point of view.
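
    A first-pass cleanup for labels like these is just a lookup table of known variants. The mapping below is hypothetical and covers only the cases mentioned above; in particular, sending CANADA to Ontario is a guess based on the observation that many Ontario places carry that label:

```python
# Hypothetical normalization table for the province-label variants
# described above ("BC", "B.C.", "B", "B.", blank, "CANADA").
VARIANTS = {
    "BC": "BC", "B.C.": "BC", "B": "BC", "B.": "BC",
    "CANADA": "ON",   # guess: many Ontario stations reportedly labeled CANADA
}

def normalize_province(label):
    """Map a raw province label to a canonical code, or UNKNOWN if blank."""
    label = (label or "").strip().upper()
    return VARIANTS.get(label, label or "UNKNOWN")

print(normalize_province("b.c."))   # BC
print(normalize_province(""))       # UNKNOWN
```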

    The ways of looking at the data are limited only by your imagination. With the scripts generated by Jeff, Nick and others, and if you’re good with code (I’m not), there is a lot of interesting information available to you and a lot of different and interesting ways of looking at it.

  16. TA said

    So the excuse for not addressing valid criticism (about the climate record) is… after it is addressed there would still be invalid criticism. I get it.

  17. Carrick said

    #15, interesting comments!

  18. Jeff Id said

    #15 It sounds like an amazing amount of work, I’m curious about specifics. Can you narrow it to an interesting region so that the rest of us can check and confirm what you’ve found?

  19. steven Mosher said

    I second what Steve has to say. I’ve spent a couple of days now on the Antarctic data.

    What a mess. No, the data isn’t “wrong”, but you have missing WMO numbers, wrong WMO numbers, different name spellings, different metadata. Now of course GISS has some files here and there that document all the little fixes, but you would think that BAS would just fix the posted metadata; it’s less than a day’s work. There are also incredibly dumb things they do with the data, like posting it with color coding to indicate the quality of the data. Seriously, how hard would it be to post a csv file? How hard to post the lat/lons in a standard format, or the data in GHCN format?

  20. FINN said

    Jeff, you mentioned that we don’t know whether the loss of 2/3 of the thermometer data causes a warming bias. It is very clear to me that the loss itself is not the reason for the warming in the thermometer data.

    But what I STRONGLY suspect is that the remaining 1/3 keeps the warmest of the rural stations and the coldest of the urban stations, so the UHI effect is “hidden” between the trends of the two.

    So the loss of thermometers has to do with the UHI effect rather than with the trend in temperature rise.

    Have you ever thought about or looked into this?


  21. stan said

    What we have here is gross incompetence run amok. The data collection is incompetent. The folding, spindling and mutilating of the data before it is stored is incompetent. The data storage is incompetent. The statistical studies of the data are, more often than not, incompetent. The software produced to do studies is incompetent. The various compilations and assessments that these scientists have produced are incompetent.

    Steig is right that better data won’t make any difference. As long as the incompetents are in charge, the quality of the science won’t improve unless a whole lot more than the data is cleaned up. Hercules faced an easier task when he had to clean the Augean stables. There was less horse —- to clean out.

  22. tonyb said

    Steve Hempell #15

    I ‘collect’ instrumental temperature records prior to 1880 and other records – diaries, observations – that go back to Roman times.

    It is quite apparent that following the peak of the MWP there were a series of climatic downturns, and we can then trace the instrumental record – in the case of CET – from 1660.

    With some notable peaks and troughs, some bits of the earth appear to have been warming since around 1690 (the coldest period of the LIA). There was a considerable upturn around 1700 lasting thirty years, followed by a slight decline (and many cold interludes), but overall the trend has been up for 320 years.

    GISS measures from 1880, when the last of the LIA interludes occurred, so it is measured from an artificially cold baseline. However, if you look at the CRU and GISS figures, all they demonstrate is that they have plugged into the end portion of this very long upturn.

    UHI in cities plays an obvious part in local warming and occurs in the type of location which constitutes a large part of the Giss database these days. Add in the complication that a station move means a change from one micro climate to another, and often to a warmer one, and the 320 year warming trend is probably reduced in overall scale, but still apparent.

    If the overall scale trend is reduced, the many warmer periods in the LIA will probably become even more significant in extent than they now appear.

    That we shouold be warming since the LIA should surprise no one-the relatively gentle scale of the recovery and the long period it has been happening over may be more surprising.


  23. tonyb said

    Steve my#22

    By the way, I meant to mention that there is plenty of evidence that bits of the world have been cooling for a long time as well. I suspect the bits that have warmed are greater in extent than the ones that have cooled, so taking a crude temperature average may produce an overall warming in ‘global’ temperatures.

    However, what is certain is that the warming is neither global in extent nor catastrophic, which is probably why there has been a switch to ‘climate change.’


  24. JR said

    Re: #15

    There are other issues with GHCN as well. For instance, for the U.S. stations prior to 1951, the monthly means in v2.mean (which is GHCN raw data) are not actually raw data. They are adjusted from (tmax+tmin)/2 to the equivalent of a mean of 24 hourly observations, and for the vast majority of the stations, this adjustment is downward. This adjustment stopped in January 1951, so adjusted and unadjusted data are “stitched” together in the timeseries of these U.S. stations. For Eureka, Calif. which has one of the most extreme downward adjustments in the pre-1951 data, the trend changes from about 1 deg/century when the v2.mean data are used as-is to less than 0.5 deg/century when the pre-1951 data are adjusted back to (tmax+tmin)/2.
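
    The stitching effect JR describes is easy to demonstrate with a toy series: apply a constant downward adjustment to the early half of a perfectly flat record, and the stitched series acquires a spurious positive trend. The -0.3 C figure below is an arbitrary illustration, not the actual GHCN adjustment for any station:

```python
import numpy as np

def ols_trend(vals):
    """OLS slope of a series against its index (degrees C per year)."""
    return np.polyfit(np.arange(len(vals)), vals, 1)[0]

# Toy series: a flat true climate, 100 years at 15.0 C.
true_temps = np.full(100, 15.0)

# Suppose the first 50 years carry a hypothetical -0.3 C adjustment
# (analogous to the pre-1951 change described above) and later years do not.
stitched = true_temps.copy()
stitched[:50] -= 0.3

spurious = ols_trend(stitched) * 100   # spurious trend, C per century
```

    A step of 0.3 C at the midpoint of a flat century-long record produces a spurious trend of roughly 0.45 C/century, which is the same order as the difference JR reports for Eureka.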

  25. Steve Hempell said


    I would be willing to give you the scripts and databases I have generated for the arctic and the “84 best countries” for a start. It would be nice to get what I have done checked by people more competent in code than myself. I have some time to work on this in the next few days, so I could organize all the different things I’ve done and sent it to you. Just the scripts and databases, not all the plots etc!! If people want to check the script out they will have to download the stuff from Nick’s website to modify the GHCN database for version 4. By the way, Nick has been absolutely great in helping me out with the scripts. A very patient guy!!

    Also, I did send you some stuff on rworldmap. However, I have improved on that since although I am still mystified on how to make the maps better and easier to compare.

    I gave up on the Canadian provinces for now. When I tried to correct the abbreviations, it messed up the databases.

  26. A C Osborn said

    Kenneth Fritsch said
    May 11, 2010 at 3:25 pm
    Phil said
    Individual station homogeneity issues cancel as sites are all affected differently. Getting this right has hardly any effect (none in fact) on the large-scale averages. Might affect smaller regions, and it’s good to get as many right as possible, as the deniers will claim if one is wrong the whole lot is wrong. The law of large numbers seems to be totally forgotten by those collecting pictures of siting across the US.

    Of course “getting this right has hardly any effect”: the homogenized data is then fed into the “grid” system. Has anyone really thought about why they use the grid system and don’t just average all the station data?
    It further averages the data, so instead of the “errors” and changes having a proper influence, they are lost in averaging averages.
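
    Whatever one thinks of the motivation, the mechanics of a gridded average are simple: stations are averaged within each cell, then the cells are averaged, so each region counts once regardless of how densely it is sampled. A toy sketch, omitting the cos-latitude area weighting a real analysis would also apply:

```python
from collections import defaultdict

def plain_average(stations):
    """Mean over every station, so densely sampled regions dominate."""
    return sum(t for _, _, t in stations) / len(stations)

def gridded_average(stations, cell=5.0):
    """Average stations within each 5x5 degree cell, then average the cells."""
    cells = defaultdict(list)
    for lat, lon, t in stations:
        cells[(int(lat // cell), int(lon // cell))].append(t)
    cell_means = [sum(v) / len(v) for v in cells.values()]
    return sum(cell_means) / len(cell_means)

# Toy: 9 clustered warm stations in one cell, 1 cool station elsewhere.
stations = [(51.0 + 0.1 * i, 0.5, 1.0) for i in range(9)] + [(61.0, 0.5, 0.0)]
plain = plain_average(stations)     # 0.9, dominated by the cluster
grid = gridded_average(stations)    # 0.5, one vote per cell
```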

    The work that Steve Hempell is doing in #15 sounds very similar to the work done by E M Smith.
    One thing that has become very obvious is that when individuals obtain the data for their local stations, it bears no resemblance to the data held by NCDC, GHCN, GISS, CRU and HadCRUT, or even the national repositories; just look at the Australian and New Zealand data as examples.

  27. Jeff,

    The choice of stations that were “dropped” or not “dropped” isn’t particularly mysterious. Monthly updating data exists post-1992 for GSN stations that release monthly CLIMAT reports and for USHCN stations. Stations not part of GSN or USHCN aren’t updated in GHCN post-1992. These stations will get updated records when GHCN v3 comes out next year.

    Also, despite quite a bit of searching, we have yet to find a substantive case where station dropout affects temp records at anything but a local scale.

    As far as UHI goes, combine the work you, Nick, or I have done on temp reconstructions with the metadata improvements that Ron Broberg and Mosh are working on and test it out. Try only rural stations, only dark nightlight stations, only non-airport stations, only non-coastal stations, or a combination of all of these and see if the trend changes vis-a-vis urban stations in the same gridcell. From the early work I did on this over at Lucia’s place UHI seemed to be real but relatively small (~10% or less) at a global land temp level. That said, I hope more folks explore different ways of testing for UHI, as my methods were far from perfect.
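The subsetting test described above can be sketched in base R. The inventory below is entirely synthetic (the column names and the 0.02 deg/decade urban excess are invented for illustration, not real metadata); the idea is simply to difference urban and rural trends within shared gridcells and average the differences.

```r
# Synthetic station inventory with an injected urban trend excess.
set.seed(2)
n <- 200
inv <- data.frame(lat   = runif(n, 25, 50),
                  lon   = runif(n, -120, -70),
                  class = sample(c("rural", "urban"), n, replace = TRUE))
# Build per-station trends with a small urban excess (the effect to recover)
inv$trend <- rnorm(n, 0.20, 0.05) + ifelse(inv$class == "urban", 0.02, 0)
inv$cell  <- paste(floor(inv$lat / 5), floor(inv$lon / 5))  # 5x5 degree cells
# Urban-minus-rural trend difference within each cell holding both classes
diffs <- sapply(split(inv, inv$cell), function(d) {
  if (length(unique(d$class)) < 2) return(NA_real_)
  mean(d$trend[d$class == "urban"]) - mean(d$trend[d$class == "rural"])
})
uhi_est <- mean(diffs, na.rm = TRUE)
round(uhi_est, 3)
```

With enough cells holding both classes, the average difference recovers something close to the injected offset; with real metadata the same comparison would estimate the UHI contribution directly.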

  28. Steve Hempell,

    Try defining the province by its rough lat/lon dimensions, that should give you a pretty good picture of BC. I did something similar to limit my analysis to CONUS (well, coupled with the country code restriction).
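In the script's terms, that bounding-box filter might look like the following. The toy `tv` inventory and the BC bounds here are my own rough guesses, not values from Nick's script:

```r
# Toy stand-in for the tv station inventory used in the script
tv <- data.frame(id  = 1:4,
                 lat = c(49.3, 53.9, 45.5, 58.8),
                 lon = c(-123.1, -122.7, -73.6, -94.2))
# Rough lat/lon box around British Columbia (approximate!)
in_bc <- tv$lat > 48 & tv$lat < 60 & tv$lon > -139 & tv$lon < -114
tv[in_bc, ]   # keeps the two western stations, drops the eastern ones
```

A box like this will catch a sliver of Alberta and miss nothing important for a first pass, which is usually good enough for a regional reconstruction.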

  29. Jeff Id said

#27, I’ve also run the UHI tests from just the metadata records, but even stations classified as rural often have new construction or are located next to a/c units, blacktop, etc., so the 10 percent is probably lower than the actual effect. I wouldn’t be surprised to see 20% in a better study.

  30. Jeff,

    For specific cases or countries 20% is certainly possible, but I doubt it would be that high globally. That said, the fun part will be finding out, and our a priori predictions are mostly useless since they tend to be influenced by our preconceptions.

My early pairwise approach found a ~20% difference between Urban/Rural or Bright/Dark, though that becomes a ~10% difference from the mean of the two.

  31. I should add the caveat that it will depend a lot on the period you are looking at.

    For example, UHI in the CONUS is probably ~15% for 1900-present but < 5% from 1960-present. I'd imagine in China you would see the opposite.

  32. steven Mosher said

    Steve H.

For a proper location of stations, we will need to build a function that returns a province for a given lat/lon. I started down this path, but it requires me learning some GIS stuff and perhaps using resources like this:

On occasion I take some time looking for a quick and dirty way to do it, but in the end that will be wasted effort.

Ron B is working on alternative metadata sources now and doing some very interesting stuff. As it stands, the metadata
in GISS (v2.inv) is just a sloppy piece of work. Now, people shouldn’t rush off and say global warming isn’t real because the metadata has some warts, but it does make doing studies hard.

If it were cleaned up, then doing some exhaustive slicing and dicing would be very simple. As it is, it’s a lot of hand work.

Anybody who has GIS skills (and R skills) should hit up Ron’s site and pitch in. Also, Google Earth skills.
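The province-for-a-lat/lon function Mosher describes ultimately reduces to a point-in-polygon test against province outlines; the sp package (`point.in.polygon`, `over`) does this properly for real shapefiles. As a self-contained sketch of the core idea, here is a ray-casting test in base R against a made-up square standing in for a province boundary:

```r
# Ray-casting point-in-polygon test: count boundary crossings of a
# horizontal ray from the point; an odd count means the point is inside.
in_poly <- function(x, y, px, py) {
  n <- length(px)
  inside <- FALSE
  j <- n
  for (i in 1:n) {
    if (((py[i] > y) != (py[j] > y)) &&
        (x < (px[j] - px[i]) * (y - py[i]) / (py[j] - py[i]) + px[i]))
      inside <- !inside
    j <- i
  }
  inside
}
# Unit square as a stand-in "province" outline (lon, lat vertices)
px <- c(0, 1, 1, 0)
py <- c(0, 0, 1, 1)
in_poly(0.5, 0.5, px, py)   # TRUE: inside
in_poly(1.5, 0.5, px, py)   # FALSE: outside
```

Looping a station list over a set of province polygons like this (or letting sp do it against real boundary data) gives the lat/lon-to-province lookup.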

  33. steven Mosher said

    Steve H.

    maybe this

I have looked at the R ‘sp’ package, but haven’t spent any time on it.

  34. steven Mosher said

    no joy:

    Country: Canada
    Format: R data
    Data not available
    Division Level 1
    Division Level 2
    Division Level 3

  35. steven Mosher said

    lots of stuff. not enough time

  36. steven Mosher said

    and there are historical versions going back to 1700 FWIW

  37. Steve Hempell said

If anyone wants to run some of my stuff on Nick’s GlobTempLSv1.4.r script, here are the changes needed to run the global best countries and the Arctic. You will have to download Nick’s version 1.4, available here, and do the prelim work to get the modified database if you do not have it.

    For Global by best countries (also regions)

#title = c("Global")
#title = c("Only Africa");
#title = c("Only Asia");
#title = c("Only SA");
#title = c("Only NA");
#title = c("Only Oceania");
#title = c("Only Europe");
# choices = 1

#"Global" = tv$country==101 | tv$country==115 | tv$country==119 | tv$country==122 | tv$country==124 | tv$country==125
    #|tv$country==126 | tv$country==127 | tv$country==129 | tv$country==130 | tv$country==137 | tv$country==138 | tv$country==141
    #|tv$country==148 | tv$country==149 | tv$country==152

    #|tv$country==205 | tv$country==206 | tv$country==207 | tv$country==210 | tv$country==216 | tv$country==219 | tv$country==221
    #|tv$country==224 | tv$country==228 | tv$country==229 | tv$country==231

    #|tv$country==301 | tv$country==303 | tv$country==304 | tv$country==307 | tv$country==308 | tv$country==313 | tv$country==314

    #|tv$country==406 | tv$country==414 | tv$country==423 | tv$country==424 | tv$country==425 | tv$country==431

    #|tv$country==501 | tv$country==503 | tv$country==505 | tv$country==507 | tv$country==509 | tv$country==525

    #|tv$country==602 | tv$country==603 | tv$country==606 | tv$country==605 | tv$country==609 | tv$country==610 | tv$country==611
    #|tv$country==612 | tv$country==613 | tv$country==614 | tv$country==615 | tv$country==616 | tv$country==617 | tv$country==618
    #|tv$country==619 | tv$country==620 | tv$country==621 | tv$country==622 | tv$country==623 | tv$country==628 | tv$country==630
    #|tv$country==633 | tv$country==634 | tv$country==635 | tv$country==636 | tv$country==637 | tv$country==639 | tv$country==643
    #|tv$country==645 | tv$country==646 | tv$country==649 | tv$country==650 | tv$country==651 | tv$country==652 | tv$country==653,

    ########By continent or region

#"Only Africa" = tv$country==101 | tv$country==115 | tv$country==119 | tv$country==122 | tv$country==124 | tv$country==125
    #| tv$country==126 | tv$country==127 | tv$country==129 | tv$country==130 | tv$country==137 | tv$country==138 | tv$country==141
    #| tv$country==148 | tv$country==149 | tv$country==152

#"Only Asia" = tv$country==205 | tv$country==206 | tv$country==207 | tv$country==210 | tv$country==216 | tv$country==219
    #|tv$country==222 | tv$country==221 | tv$country==224 | tv$country==228 | tv$country==229 | tv$country==231

#"Only SA" = tv$country==301 | tv$country==303 | tv$country==304 | tv$country==307 | tv$country==308 | tv$country==313 | tv$country==314

#"Only NA" = tv$country==403 | tv$country==406 | tv$country==414 | tv$country==423 | tv$country==424 | tv$country==425 | tv$country==431
    #| tv$country==435

#"Only Oceania" = tv$country==501 | tv$country==503 | tv$country==505 | tv$country==507 | tv$country==509 | tv$country==525

"Only Europe" = tv$country==602 | tv$country==603 | tv$country==606 | tv$country==605 | tv$country==609 | tv$country==610 | tv$country==611
    #| tv$country==612 | tv$country==613 | tv$country==614 | tv$country==615 | tv$country==616 | tv$country==617 | tv$country==618
    #| tv$country==619 | tv$country==620 | tv$country==621 | tv$country==622 | tv$country==623 | tv$country==628 | tv$country==630
    #| tv$country==633 | tv$country==634 | tv$country==635 | tv$country==636 | tv$country==637 | tv$country==639 | tv$country==643
    #| tv$country==645 | tv$country==646 | tv$country==649 | tv$country==650 | tv$country==651 | tv$country==652 | tv$country==653,

    For the Arctic
title = c("West NA Arctic","Mid NA Arctic","East NA Arctic","Greenland","Iceland","Faroe Is","Sweden","Norway","Finland","West NEur Arctic","Mid NAsian Arctic","East NAsian Arctic");
    choices = c(1,2,3,4,5,6,7,8,9,10,11,12)


    “West NA Arctic” = tv$lat>60 & tv$lat -170 & tv$lon 60 & tv$lat -140 & tv$lon 60 & tv$lat -110 & tv$lon 60 & tv$lat 5 & tv$lon 60 & tv$lat 60 & tv$lon 60 & tv$lat 130 & tv$lon < 180,

An interesting thing happened when I prepared the Best Global part. I noticed that I had missed 3 countries: Cook Is., Trinidad, and Russia (Asia). When I added these, the trend went from 0.068 to 0.074. If I took Russia away, it dropped to 0.068. If I took Canada away too, it dropped to 0.063.

  38. Steve Hempell said

    Oops, something got messed up on the Arctic part. Try again


    “West NA Arctic” = tv$lat>60 & tv$lat -170 & tv$lon 60 & tv$lat -140 & tv$lon 60 & tv$lat -110 & tv$lon 60 & tv$lat 5 & tv$lon 60 & tv$lat 60 & tv$lon 60 & tv$lat 130 & tv$lon < 180,

  39. Steve Hempell said

Sorry Jeff. One last time.


    “West NA Arctic” = tv$lat>60 & tv$lat -170 & tv$lon 60 & tv$lat -140 & tv$lon 60 & tv$lat -110 & tv$lon 60 & tv$lat 5 & tv$lon 60 & tv$lat 60 & tv$lon 60 & tv$lat 130 & tv$lon < 180,

  40. Jeff Id said


You ought to take a crack at a short writeup of something interesting you found. I really don’t have time to review and write up others’ work, but doing all that work deserves to be read. It’s up to you, of course.

  41. Carrick said


    Also, despite quite a bit of searching, we have yet to find a substantive case where station dropout affects temp records at anything but a local scale.

    It can matter for data prior to 1950, if you don’t do your geographical weightings carefully.

  42. Eric Steig said

    Guys, nice to see that I can post here and get very reasonable and thoughtful responses. Thanks!

    Regarding this: If Eric Steig’s RC post accidentally selected all city centers (and I suspect it did), and his result matched CRU, we could reasonably conclude that UHI is a significant factor in CRU data — but how much would still be unknown.

The point of our post wasn’t to address the UHI question, but only the question of whether there was any evidence whatsoever to conclude that there had been data manipulation. That result stands, regardless of the UHI question.

    Here’s the list of stations we used in that analysis. No intention of hiding that — it just wasn’t important for the point of the post. (And again, that point stands and I’ve seen no one even try to challenge it. I wish the public, the Republicans, the Democrats, everyone would get that point. The continued insinuations about all the bad things Phil Jones did with the data really annoy me. I mean, there is NO evidence guys….

    Anyway, enough ranting: here is the list. There might be a couple of extras which we didn’t use because they were too short for our quick analysis. Have at it if you like.



  43. steven Mosher said

    Thanks Dr Steig

“The point of our post wasn’t to address the UHI question, but only the question of whether there was any evidence whatsoever to conclude that there had been data manipulation. That result stands, regardless of the UHI question.”

I think after rereading your post we (carrot, myself, and others) came to the conclusion that you set out to prove that there was no data manipulation and that you had a test that showed that.

As I understood your process, you selected stations from UCAR data (raw); they were long-lived stations, 100 years +. You then randomly selected stations from that list and looked for stations that had data in 1960-91 (26 years).

You then picked the CORRESPONDING stations from CRU. This point, or procedure, wasn’t very clear, which is why some of us asked for the stations.

You then divided this sample into two groups, roughly 30 pairs per group. You showed that

1. a random selection of data from another dataset was no different from a “corresponding” dataset taken from CRU.

The point was to show “no data manipulation.”

However, what the test actually shows is that the differences in the data are small. While I don’t believe there is anything one could call data manipulation, I would argue that the test only rules out large differences between the datasets using a small sample size; it doesn’t demonstrate ZERO manipulation. It can only rule out “differences” of a certain magnitude. That’s a fine logical distinction. Am I clear in my understanding? Again, your point was to demonstrate that there was no discernible difference between CRU data and data from a different source. You conclude from the lack of any large difference that there was no manipulation. Absent any significant difference, it’s prudent to assume that there was no “manipulation.” The charges of manipulation are thrown around rather loosely. From the auditing perspective, I will say my goal hasn’t been to find any manipulation, but rather to account for all the steps and decisions and see that the “uncertainty” associated with those decisions is properly accounted for.
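The point that a ~30-pair comparison can only rule out differences above a certain size is a statistical power question, and base R can put a rough number on it. The sd of the pairwise differences below is invented for illustration; `power.t.test` then solves for the minimum detectable difference:

```r
# Smallest systematic offset a paired t-test on 30 station pairs would
# detect 80% of the time at the 5% level, assuming sd = 0.1 for the
# pairwise differences (an illustrative guess, not a measured value).
mdd <- power.t.test(n = 30, sd = 0.1, sig.level = 0.05,
                    power = 0.8, type = "paired")$delta
round(mdd, 3)
```

Any systematic difference smaller than `mdd` would likely slip through such a test, which is exactly the "certain magnitude" caveat above.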

    Thanks for coming back

  44. steven Mosher said


Can you describe your pairwise approach? Did you do paired stations? I’m thinking of adding a function that lets you select a parameter or factor (like rural/urban) and then generate a pair list of a RURAL station and its closest URBAN station, for example, or any other factor.

  45. tonyb said

    Dr Steig

I don’t want to distract you from your interesting conversation with Steven Mosher, but I would appreciate some clarification on a couple of points. Firstly, however, can I thank you for your involvement here, and agree that it is useful to find a forum where this subject can be discussed without the rancour and point-scoring that is often found elsewhere.

I have a particular interest in pre-1850 instrumental records; here is my site.

    There are also a variety of ‘anecdotal’ records that clearly tell us what the climate is doing that are also available from the site.

I have two questions. Firstly, do you believe we are comparing like for like when we examine the records and make official pronouncements that are supposed to be accurate to fractions of a degree? I have tried to follow the peripatetic nature of a variety of stations, and it is astonishing how often they can move, which means that a station ends up recording a completely different microclimate from the one that preceded it. I think microclimates are a neglected area, but basically it means we can end up comparing apples with oranges, as each microclimate is by definition different from another and can respond differently to changing weather patterns and circumstances. For example, a change in prevailing wind direction over an extended period can make even the anomaly calculation somewhat problematic, as this may affect two sites differently.

The most common change, of course, is a move to a warmer location, typically an airport, or being engulfed by urban sprawl, which the Romans recognised 2000 years ago would make a considerable difference to temperatures. Whilst this is a local, rather than global, effect, as many GISS stations are in an increasingly urbanised environment it will have a proportionately greater effect on that database.

    The second question relates to both these subjects. Whilst CET may not be a single station, it is nevertheless the longest and most examined data set in the world. The stations involved in the triangulation have moved and the urban adjustment made by Parker from 1974 has aroused some controversy. It therefore neatly encapsulates both the problems cited above over station moves and urbanisation.

    This article usefully sets out some of the discussion.

    I wonder if you have a view on this?


46. I (and I suspect most scientists) am all for more effort in (and funding for) long-term data monitoring. But I’d warn against making quick judgements/accusations of malfeasance, especially if different analyses come to the same conclusions and comparison of raw and adjusted data shows no sign of trends being affected, never mind that satellite products also more or less agree.

    That said, my initial point, quoted by Jeff in the headpost, is important imho.

  47. steven Mosher said


As one of the people who has spent two years trying to untangle what was done, I’ll say that there are really two groups of us. The first group is quick to latch onto any issue, any “mistake”, any form of sloppiness, and turn it into a charge of fraud. The second group is a class of person who loves data and who has a “thing” for the unglamorous job of collecting and organizing hoards of data. They tend to like things neat, organized, and fully documented. As I trudge through this stuff, I see the typical signs of taking shortcuts here and there with stuff that doesn’t matter. That’s annoying. Stupid stuff, really. As a former engineer I look at it and shake my head. I don’t see fraud. I do see potential trouble. I see a brittle system. Now, to be sure, mistakes will happen (Y2K) and they get caught and corrected, but it’s all unnecessary if good practices were in place to begin with. The people who run around and scream fraud about the land record are getting to be way more annoying to me now. WRT data monitoring, there is also the fun task of collecting and rectifying old records. For some of us that is WAY more fun than R&D. Go figure.

Also, when we see a sentence like “more or less agree,” that difference gets our attention.

You see a trend of .15C/decade in the satellite record (for example) and one of .17C/decade in the surface record, and you say “more or less agree.” That’s true.

I see that same THING and I think: “I wonder if I can find the missing piece of that puzzle.”

And you say: look, 95% of the puzzle pieces are in place, can’t you see the big picture?
I see the big picture. Cool. Now, about that missing puzzle piece. You see, I like that missing-puzzle-piece thing.

  48. Kenneth Fritsch said

    When Eric Steig @ Post #42 says:

The point of our post wasn’t to address the UHI question, but only the question of whether there was any evidence whatsoever to conclude that there had been data manipulation. That result stands, regardless of the UHI question.

I think we need to think in terms of what the uncertainty of the three land-based temperature data sets is. Most thinking and skeptical people are not interested in whether the data has been “manipulated”, whatever that might mean, but rather in calculating some measure of the uncertainty in the data. On reading about current and past attempts to do this by the mainstream climate scientists, I am struck by the apparent need to make some initial assumptions before attempting to calculate uncertainties.

    I think that more devastating than the charge of manipulation of the data, for which there is no evidence, is the sloppiness of the data owners and hesitation to correct past problems (laziness?) for which there is evidence. And, by the way, I would not consider someone who looks for errors and adjustments in one direction only as someone who is “manipulating” the data, but I would realize that it could cause biases to the final result.

Judith Curry is the only consensus climate scientist I am aware of who has suggested that the temperature data sets need to be analyzed by independent groups and not by their owners and supporters. When I hear a temperature set owner exclaim (hand-wave) that the revealed errors make no difference and that they all cancel out (but the adjustments to their sets were primarily in one direction), I get mighty suspicious that that owner just might be making more of a lawyerly defense than a science observation.

  49. Steve Hempell said

Regarding the satellite data agreeing with the surface data:

It drives me nuts when people draw a straight-line trend through the surface and satellite data. The ultimate in this is the plot on the RSS webpage where they draw a straight line through the TLS data and state that the temperature has steadily decreased by -0.321 K/decade. No it has not, as anyone can see by simply looking at the plot. The plot implies that the two volcanoes have caused two step changes in the TLS temperatures, and the time between and since the volcanoes is essentially trendless.

I am in the Bob Tisdale camp when it comes to the El Nino of 1998. He, and others, think that there was a step change in 1998. I therefore like to look at the pre-El Nino trend (Jan 1979 to Oct 1997) and the post-El Nino trend (Oct 1999 to present). Looking at the data with this POV is where the surface temperature and satellite temperature trends part company, especially UAH. I want to look more closely at this using Nick’s program (so I may change my mind after looking at the data). Also, I pointed out to William Briggs two years ago that there seemed to be only a small trend in UAH pre-1997 (~0.03 deg/dec), and he did an interesting post on his website using the RSS data.

So I’m with Steve Mosher on this. The data gives rise to question after question. Why are the surface and satellite trends so different pre/post the 1998 El Nino? Why are UAH and RSS different?
Why, why, why; I sound like a two-year-old.

    I too “wonder if I can find the “missing piece” of that puzzle.” I’m just less skilled at it than Steve M.
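The step-versus-trend question can be tested directly. Below is a sketch on synthetic monthly anomalies built, by construction and with invented numbers, to contain a 1998 step and no trend; comparing AIC then shows which description the data prefer:

```r
# Synthetic monthly anomalies: a 0.25-degree step at 1998, no trend,
# plus noise. Fit both descriptions and compare them by AIC.
set.seed(3)
t <- seq(1979, 2009, by = 1/12)                             # monthly time axis
y <- ifelse(t < 1998, 0, 0.25) + rnorm(length(t), 0, 0.15)  # step, no trend
trend_fit <- lm(y ~ t)             # straight-line trend model
step_fit  <- lm(y ~ I(t >= 1998))  # step-at-1998 model
round(c(trend = AIC(trend_fit), step = AIC(step_fit)), 1)
```

On this constructed series the step model wins; run on real surface and satellite anomalies, the same comparison makes "step change in 1998 or steady trend?" an answerable question rather than an eyeball argument.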

  50. Steve Hempell said


    Re #40

    I think I’ll try to do as you suggest, maybe in the next few days. What do I do – just e-mail any writeup to you?

  51. RomanM said

    Re: steven Mosher (May 12 13:57),

    no joy:

    Country: Canada
    Format: R data
    Data not available
    Division Level 1
    Division Level 2
    Division Level 3

There is some 😉 data linked under “Level 1” (18.5 MB), “Level 2” (61 MB), and “Level 3” (55.7 MB). These are R workspaces containing map information at various levels. Using the sp and maptools libraries of R to manipulate the poly shapes is going to take some work to get through.

    When I was gathering some info on the mapping, I ran across a website which could prove useful for anyone who would like to dig into UHI effects. It contains population data on 590 cities which had populations of 750K or greater in 2005. The database contains the population for each of the cities starting in 1950 at 5 year intervals until the present along with projections of the population at 5 year jumps to 2025 with an added projection for the year 2050.

    I notice that Dr. Steig’s set A in #42 includes Rapid City. Even after looking at the USHCN data, I still haven’t figured out where the last 30 years or so of the Met Office data (presumably the same as the CRU version) comes from since it isn’t even close to all of the others.

  52. Steven Mosher,

    Thanks, that’s a helpful reply in order to understand where you and yours are coming from. You’ve got a point there.

    Good to read that you too are getting annoyed with the fraud!-yelling crowd. Perhaps you could be the Judith Curry for your side? (as Keith Kloor suggested someone should also take on that role on ‘the other side’)

  53. steven Mosher said


Ha, Judith and I have talked at length. Close readers can see the difference she has made in me, but there might be a case to be made for a more forthright statement, especially about the land record. In individual cases I’ve said some fairly strong things to the fraud mongers.

Maybe something where I lay out the concerns and which can be put to rest. I could make a list: some of the concerns were silly, others ignorant, some conspiratorial (dropping thermometers), some nitpicking, some substantive; none, with the exception of the concerns over UHI, challenges the science. And even in the case of UHI, it’s simply a matter of accuracy.
(That will get folks going.)

  54. Kenneth Fritsch said

It drives me nuts when people draw a straight-line trend through the surface and satellite data. The ultimate in this is the plot on the RSS webpage where they draw a straight line through the TLS data and state that the temperature has steadily decreased by -0.321 K/decade. No it has not, as anyone can see by simply looking at the plot. The plot implies that the two volcanoes have caused two step changes in the TLS temperatures, and the time between and since the volcanoes is essentially trendless.

I am in the Bob Tisdale camp when it comes to the El Nino of 1998. He, and others, think that there was a step change in 1998. I therefore like to look at the pre-El Nino trend (Jan 1979 to Oct 1997) and the post-El Nino trend (Oct 1999 to present).

What drives you nuts is what was driven home to me when comparing temperature anomalies for stations within the same 5 x 5 grid. In cases where I could calculate the same trends in the data for pairwise comparisons, I could, at the same time, see very different coherence and breakpoints in the temperature series. That 1998 peak temperature does appear as a breakpoint in the data I have looked at, although the analyses that I have seen in the past calculate, as I recall, 3 breakpoints in the global mean temperature anomaly over the past 130 years. Climate science as a whole does not appear to have much to say about change points. The recent Burger paper brought this to the fore in his coherence (and lack thereof) analysis of historical temperature proxies and reconstructions.

    I also know people are driven nuts when statistically derived change points are shown with a temperature series.

  55. Kenneth Fritsch said

I intended to note in my previous comment that the 1998 break is not included in the 3 breaks that are normally calculated. That was with older data, and perhaps with more recent points on the graph that year would provide a statistical breakpoint. Unfortunately there are several parameters in calculating breakpoints, including the confidence limits and the segment lengths.
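For what it's worth, the core of a change-point calculation is small enough to sketch in base R on synthetic data (the strucchange package's `breakpoints` does this properly, with multiple breaks and confidence intervals; the minimum segment length below plays the role of the segment-length parameter Kenneth mentions):

```r
# Brute-force single change-point in the mean: try every admissible split,
# keep the one that minimizes the total residual sum of squares.
set.seed(4)
y <- c(rnorm(60, 0, 1), rnorm(60, 1.5, 1))   # synthetic: mean shift after obs 60
min_seg <- 10                                 # minimum segment length
cand <- min_seg:(length(y) - min_seg)         # candidate break positions
rss  <- sapply(cand, function(k)
  sum((y[1:k]    - mean(y[1:k]))^2) +
  sum((y[-(1:k)] - mean(y[-(1:k)]))^2))
best_k <- cand[which.min(rss)]
best_k                                        # estimated break position
```

Shrinking `min_seg` or loosening the confidence requirement changes which breaks are found, which is exactly why two analyses of the same series can disagree on the number of change points.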

  56. intrepid_wanders said

Reply to Steven #53

    As you said, “that will get folks going”;

I have worked with measuring equipment for more than 15 years (semiconductors; microns and carrier concentration), and from a statistical standpoint, if there is too high a variation or no correlation between metrology devices and one is removed from service WITHOUT a comment, I am sure in any “process” this would cause excitement. I understand this ISO/QS behavior may not be something that is exercised in the academic world, but WOW.

    The reason for procedure and journals is to prevent the “fraud” charge from crossing the lips. Engineers do it. Accountants do it. Programmers do it…

Until a self-correction is performed, this will always remain political, because without real documentation or journals to aid in understanding of these WTFs, the only understanding IS fraud.

  57. steven Mosher said


“Until a self-correction is performed, this will always remain political, because without real documentation or journals to aid in understanding of these WTFs, the only understanding IS fraud.”

That’s false. There are many understandings besides fraud. My sense, having spent many years trudging through OPC and OPD (other people’s code and other people’s data), is this.

The shortcomings I see are typical of the shortcomings I’ve seen in other code and data sets produced by scientists. This is not a slam, just an observation. As an engineer you are trained to make things that survive your absence. Good bosses demand that. Your work is typically passed on to others to maintain and service. That’s why we measure things like MAINTAINABILITY. But in the scientific world, there is no feedback for poor maintainability. When I read Jones write about his own data and his own code, I had a pretty good feel for the kind of worker he was. He and others later agreed that he wasn’t the best record keeper. Document control is important to us because we are REWARDED for good doc control and PUNISHED for bad doc control. It wasn’t important to Jones. That doesn’t make him a fraud. People are sloppy, lazy, ill-informed, under pressure, untrained; there are many REASONS besides fraud. Absent any DIRECT EVIDENCE of the intentional creation of data known to be false, I think it’s indefensible and SLOPPY to charge fraud. Sloppy thinking. Negative points for u.

Further, some of the problems predate the scientists involved. Until you personally read the code, work with the data, and read all the mails, I’d suggest that you are speaking with no facts on your side. You don’t add to our understanding. For example, the vast majority of this data is HISTORICAL; there was no ISO in 1929.

  58. AMac said

    Re: steven Mosher (May 14 00:23),

    Viewing things from the scientific side of the science/engineering cultural divide you describe, I think your insight accounts for quite a lot of the problems that bedevil climate science.

(Actually, I don’t think “climate science” is a proper description for the field in this context. It implicitly assumes the perspective that certain people hold. Namely, that AGW Consensus procedures and conclusions have been arrived at “scientifically,” with the corollary that all those who contest the AGW Consensus are “unscientific” in motivation and practice. The Gleick Letter is the latest example.)

    My own experiences might be illustrative.

    As a bench-science trainee (grad student) and then as an academic scientist (fellow), it was rare for anyone to look at my lab notebook. I might choose to open it to a page and show that to a colleague or the head of the lab in explaining a result, or brainstorming, or troubleshooting. That’s quite different from somebody going through it, page by page, for a purpose of their own.

    In readying material for presentation (e.g. lab meeting, or poster, or manuscript), it was up to me to work up the data. Compile readings, choose images, perform statistics, compose summary tables and charts.

I never committed fraud. To my knowledge, neither did any of my colleagues, or anybody connected with the labs I have been in. This situation is one where a bunch of very smart people are paying a lot of attention to very-closely-related work being done a few feet away, informally discussed every day, and formally presented a couple of times a month. Typical for an academic lab. Fraud isn’t impossible, obviously, but it’s a very difficult feat to pull off in a meaningful way, e.g. for the purpose of advancing one’s career. Anyone who believes otherwise, that fraud is easy or common in academic science, should think about it more, with an open mind.

The situation was different when I moved to industry. There, the hope was that my benchwork would be “translated.” That it would lead to insights for new products, or to improved understanding of current processes, or to improved processes, or to patent filings. Culturally, employees understood that the work wasn’t “theirs,” but the company’s. We also understood that none of us had made a lifetime commitment to work at this startup, and that any of us might move to a different project.

    All of this led to a very different attitude about recordkeeping for experiments. Everyone doing benchwork was required to use only bound notebooks, and to date and sign every page. We had to bring our records to our supervisor to have him or her countersign and date every page. As a manager, that became one of my responsibilities for those I supervised. The countersigning person had to understand what the scientist had written–which often required the addition of explanatory notes (clearly dated).

    I expect that many of the engineers reading this are somewhat appalled at the slack way that I kept records of my work, during my time in academic biomedicine.

Per Steven Mosher, I think this is close to S.O.P. in most or perhaps all fields. There’s nobody assigned the responsibility for reviewing primary records, page-by-page. There’s no reward for doing so (monetary, career advancement, prestige). How complete and comprehensible notebooks and other records are will largely be a function of personality: that of the lab’s Principal Investigator, and that of the bench scientist.

    I would guess that the situation over the years at CRU or Penn State was in many respects closer to my experience in academia, than to what an engineer would expect due to his or her training.

    To an academic scientist — within climate science, or in a different area — this may not be ideal, but it is familiar, and acceptable.

The interests, problems, and opportunities in climate science haven’t changed. What’s different are the stakes, with respect to public policy. It’s “unfair” of critics to single out climatology and retroactively impose engineering-style expectations of traceability and accountability. It is understandable, if regrettable, that these sorts of criticisms are interpreted by many academics as an “attack on science.”

  59. a reader said

    Smithsonian Miscellaneous Collections Vol. 90, containing weather records 1921-1930, is now available online. It joins the earliest volume, vol. 79, which contains the earliest observations through 1920. Hopefully they will soon scan in vol. 105 as well.

  60. Layman Lurker said

    #57 and #58

    Hopefully, with all that has gone on in climate science, there will be recognition that “process” orientation would have prevented most of the circumstances which led to controversies.

    Institutions, rather than individuals, are the key to success. Independent auditing is critical. In many cases, the protocols for documentation and quality control are in place, but there is a lack of follow-through. As a case in point, I think back to the discussions at CA of how NSF administered Kaufman 09 – a lot of dangling gaps and questions between the original terms of the grant and the final product.

  61. stan said

    57 Steve Mosher,

    I don’t think he means fraud from the outset. I think the case can be made that Jones clearly misrepresented the quality of his records and his databases. Over the years, he’d constructed a big pile of crap. Then he became a hugely important, world famous scientist and had to hide the shoddy quality.

  62. Chuckles said

    #57 Amac,

    As always you make some excellent points, but the points you make suggest to me that Academia is the problem rather than there necessarily being a problem with ‘science’, or that it is a science/engineering divide.

    Peer review etc. are academic affectations as far as I’m concerned, and I’d place much more faith in a well-written industry-style report. Similarly, farm the research out on commercial contracts rather than research grants. Hopefully that would get rid of most of the prima donnas and faux outrage floating around.

    On the world temperature facet, I see the usual huffing and puffing about this data set or that and this technique or that.
    How about starting at the beginning, by defining and writing down why we need this temperature and what it is supposed to represent? From there we could possibly move on to how it would be constructed, acquired, measured, verified etc., with statisticians, scientists, and engineers all throwing in their 2c worth as appropriate.
    i.e. develop a standards specification document for what is needed, rather than ethereal data manipulation techniques on random data.

  63. steven Mosher said


    “All of this led to a very different attitude about recordkeeping for experiments. Everyone doing benchwork was required to use only bound notebooks, and to date and sign every page. We had to bring our records to our supervisor to have him or her countersign and date every page. As a manager, that became one of my responsibilities for those I supervised. The countersigning person had to understand what the scientist had written–which often required the addition of explanatory notes (clearly dated)”

    That was my experience. One of the things driving that accountability was patents. The other was the ever-present threat of audit. The work was not our work; it was work for the company. Even within a company you could see differences between an R&D function and a production function. The R&D guys just made stuff that worked and then moved on to the next problem. Sheer drudgery cleaning up after them. At one point I thought it a smart idea to put an R&D coder with a production coder on a project. Fist fight. Get it done versus follow the procedure.

    That clash of styles CAN be fruitful, if well managed. But when people like me walk into a mess of code and data we see what we have seen many times before. It’s old ground. Now, I never saw a fraud. What I did see was this.

    After finding errors, sometimes substantial errors, there was HUGE resistance to fixing the error OPENLY. Mostly people wanted to deny that the error mattered (but fix it), or they wanted to fix the error and make no notice of it. “We’ve got 5 years of studies that used that data, we can’t just do that work over; fix it and move forward.” The practice of “regression testing” was resisted.

    In the science, I’ve tried to argue for reproducible results. When errors are found, fix them, give credit, and recompile the science.

    Whew. Anyway, made some progress today on Antarctic online data. A consistent use of a standard value for “missing values” would have been nice. And easy. But alas, it takes a bunch of code to read the crap. One small change on their part would make my part easier. But they don’t have a customer. Monads.
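    (Editor’s aside.) Mosher’s complaint above, that station files use inconsistent “missing value” codes, is a common data-cleaning chore. Below is a minimal, hypothetical Python sketch of normalizing the usual sentinels into a single value before analysis. It is not his code; the codes and the sample row are invented for illustration.

```python
# Hypothetical sketch: collapse the many "missing value" sentinels seen in
# station records (-9999, -999.9, "NA", blank fields, ...) into None before
# any averaging or gridding. The set of codes here is illustrative only.

MISSING_CODES = {"-9999", "-999.9", "-99.9", "NA", "NaN", ""}

def parse_reading(field: str):
    """Return a float temperature, or None for any recognized missing code."""
    token = field.strip()
    if token in MISSING_CODES:
        return None
    try:
        return float(token)
    except ValueError:
        return None  # unparseable garbage is also treated as missing

row = "12.3, -9999, NA, 11.8".split(",")
values = [parse_reading(f) for f in row]
# values -> [12.3, None, None, 11.8]
```

    Had the archive settled on one sentinel, the whole function would reduce to a single comparison; that is the “one small change” being asked for.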

  64. steven Mosher said


    They have no experience with Requirements docs. No customer.

  65. steven Mosher said

    Re 61.

    Jones didn’t hide the fact very well. In 2002, when McIntyre asked for the data, Jones replied that he “had it somewhere on a diskette.”

    I read that on Steve’s blog long before I read the mails. That was a huge clue.

    In my experience there were two kinds of guys.

    GUY 1: You ask him for his data. He sends you a memo that says “Avoid verbal orders” and a request form. You fill out the request form and he gets you a disk. Numbered, labelled, all recorded. He is not an imaginative guy. His desk is neat and clean.

    GUY 2: You ask him for his data. He opens a drawer: a jumble of disks. He sorts through them. “Hey, look what I found.” What, the data? No, my old wedding picture. “Hmm, it’s here somewhere.” Looks under a stack of papers. BINGO, here it is, the purple one. It’s not labelled. He hands it to you. “Bring it back when you are done.” Don’t you want to give me a copy? “Err, whatever, if I need it back I’ll come and find you.” Smart, scatterbrained, has stacks of shit. Organizes spatially rather than linguistically. He is wearing two different colored socks.

    That’s a cartoon of humans, but like all cartoons it has some value.

  66. tonyb said

    Steven Mosher

    A journalist visited James Hansen’s office and described it as “comically cluttered.”

    He subsequently phoned to say it was much better than it used to be. I don’t think his figures are fraudulent, just very confused, which is likely to lead to mistakes.


  67. Chuckles said

    #64 Mosh,

    I know, I’m saying that I’d really like to see this thread doing something like fleshing out the basics of a requirements doc though.

  68. JR said

    Re: #59 – A reader – Nice find. Thanks for the link. They do have volume 105 online – search for “smithsonian 105”. They also have WWR 1941-1950, and WWR Europe for 51-60, 61-70, and 71-80. They have other WWR volumes from 1951-on but I haven’t looked to see if everything is there yet.

  69. Geoff Sherrington said

    45 Tonyb

    While you have a compelling case for a gradual upward drift in long term temperatures, the focus is perhaps on the recent blip at the end.

    One can make a case that there is essentially no global temperature change since 1900. It is not a confident case, but an example is shown on David Stockwell’s “Niche Modeling” blog.

    In summary, there was discussion of choices of an ARIMA fit to historic data, then a residual error graph.

    To save browsing time, note where I wrote 3 weeks ago: “Meanwhile, in the land of adjusted temperatures, which method gives the best fit to this graph, which I suspect that most of us use mentally, after having read paper after paper about adjustments?

    Create by:

    (a) adding 0.2 deg C to all temperatures before 1950;
    (b) leaving 1951 where it is;
    (c) de-stretching the slope from 1951 to present so the right end hits right now at 0.3 deg C cooler than shown on the parent graph.”


    “Oh heck, I’ve just realised that my adjusted graph is just like the residual error from ARIMA(3,1,0).

    I wonder what this means? Why not subtract the ARIMA residual error from my 2-step adjusted graph to see if we have near enough to a straight, horizontal line?”
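    (Editor’s aside.) Geoff’s three-step recipe is concrete enough to sketch in code. The following Python fragment is a hedged illustration of one reading of his steps, applied to a made-up anomaly series; in particular, “de-stretching” is interpreted here as a linear taper that leaves 1951 unchanged and ends 0.3 deg C cooler at the final year. The example data are invented.

```python
# One reading of the three-step adjustment: (a) add 0.2 deg C before 1950,
# (b) leave 1950-1951 untouched, (c) linearly taper post-1951 values so the
# final year ends 0.3 deg C cooler. Input is {year: anomaly in deg C}.

def adjust(series):
    years = sorted(series)
    first_modern, last = 1951, years[-1]
    out = {}
    for y in years:
        t = series[y]
        if y < 1950:
            out[y] = t + 0.2                          # step (a)
        elif y <= first_modern:
            out[y] = t                                # step (b)
        else:
            frac = (y - first_modern) / (last - first_modern)
            out[y] = t - 0.3 * frac                   # step (c): taper to -0.3
    return out

example = {1940: 0.0, 1950: 0.1, 1951: 0.2, 1980: 0.4, 2010: 0.6}
adjusted = adjust(example)
# adjusted[1940] -> 0.2, adjusted[1951] -> 0.2, adjusted[2010] -> 0.3
```

    Whether the taper should be linear, or anchored at 1951 versus 1950, is exactly the kind of question a requirements document (per Chuckles, #62) would settle up front.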

  70. DeWitt Payne said

    Re: Geoff Sherrington (May 14 21:24),

    The problem with using ARIMA(3,1,0) is that the 1 may not be 1, but less than 1. In that case the error range expands far less rapidly. There is good reason to believe that temperature is only near unit root (the bucket leaks). See my comment here for example.
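    (Editor’s aside.) DeWitt’s distinction, that a true unit root and a near-unit root behave very differently, can be shown with a toy simulation. This is not his analysis; the coefficient 0.95, the step count, and the path count are invented for illustration. A random walk (phi = 1) has variance growing roughly linearly with the number of steps, while a leaky AR(1) with phi = 0.95 settles toward sigma^2 / (1 - phi^2), about 10 for unit-variance noise.

```python
# Toy comparison of error growth: x_t = phi * x_{t-1} + e_t with unit-variance
# Gaussian noise. phi = 1.0 is a unit root (variance ~ n_steps); phi = 0.95
# is near-unit-root ("the bucket leaks"), with variance bounded near 10.3.

import random
import statistics

def simulate_ar1(phi, n_steps, n_paths, seed=0):
    """Variance across paths of the final value of an AR(1) process."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        x = 0.0
        for _ in range(n_steps):
            x = phi * x + rng.gauss(0.0, 1.0)
        finals.append(x)
    return statistics.pvariance(finals)

walk_var = simulate_ar1(1.0, 500, 2000)   # unit root: roughly 500
leaky_var = simulate_ar1(0.95, 500, 2000) # leaky: roughly 1/(1 - 0.95**2)
```

    The practical point: if the differencing order in ARIMA(3,1,0) overstates the persistence, the implied uncertainty band around the temperature series expands far faster than the data warrant.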

  71. tonyb said

    Geoff #69

    Your photobucket link returned an error. Could you check and repost? Thanks


  72. a reader said


    Be sure to read the station notes, errata pages, and editorial notes as well as the records themselves. You will find station moves, how means were figured, who did the harmonizing of records, all sorts of neat stuff.

    I mentioned at ClimateAudit a couple of times that I thought some of the errors in records could be traced back to these books, possibly by people not applying the corrections when new books came out, and thus ending up with multiple records for the same location. The early books at least did not have WMO numbers for sites. I thought some of the step changes that occur at decadal boundaries might correlate with the decadal publishing of these books. When the records were digitized, mistakes could easily have crept in. I’ve never looked into it though, so it may all just be a hunch.

    If I remember correctly from looking at Jones’ early papers, he said that he used these books.

