17 thoughts on “Refusing Data”

    Nov. 29, 2009
    “Climate change data dumped.”

    SCIENTISTS at the University of East Anglia (UEA) have admitted throwing away much of the raw temperature data on which their predictions of global warming are based.

    It means that other academics are not able to check basic calculations said to show a long-term rise in temperature over the past 150 years.

    The UEA’s Climatic Research Unit (CRU) was forced to reveal the loss following requests for the data under Freedom of Information legislation.

    The data were gathered from weather stations around the world and then adjusted to take account of variables in the way they were collected. The revised figures were kept, but the originals — stored on paper and magnetic tape — were dumped to save space when the CRU moved to a new building.

    The admission follows the leaking of a thousand private emails sent and received by Professor Phil Jones, the CRU’s director. In them he discusses thwarting climate sceptics seeking access to such data.

    In a statement on its website, the CRU said: “We do not hold the original raw data but only the value-added (quality controlled and homogenised) data.”

    The CRU is the world’s leading centre for reconstructing past climate and temperatures. Climate change sceptics have long been keen to examine exactly how its data were compiled. That is now impossible.

    Roger Pielke, professor of environmental studies at Colorado University, discovered data had been lost when he asked for original records. “The CRU is basically saying, ‘Trust us’. So much for settling questions and resolving debates with science,” he said.

    Jones was not in charge of the CRU when the data were thrown away in the 1980s, a time when climate change was seen as a less pressing issue. The lost material was used to build the databases that have been his life’s work, showing how the world has warmed by 0.8C over the past 157 years.

    He and his colleagues say this temperature rise is “unequivocally” linked to greenhouse gas emissions generated by humans. Their findings are one of the main pieces of evidence used by the Intergovernmental Panel on Climate Change, which says global warming is a threat to humanity.

    (end clip)

    So only homogenized data are necessary for other people to look at. Raw data are irrelevant. No one can reconstruct the studies now. “We threw that stuff out.” Wonder how much real raw data still exist at other research centers. It might all be gone. Worth querying.

  2. If the original data was really thrown out in the 1980s, they would have known this from the work of constructing the first two versions of HADCRUT. So why was this admission not the response to the first request and subsequent requests for information?

  3. Peter—

    Yes. Something is going on here that’s major. Did they really throw out all the raw data? If so, how many other research centers have, too?

    This could be big.

    Somebody needs to query those other centers. I’m not versed in the lingo, but if no one else will do it, and you all let me know exactly what questions to ask, and which centers to ask them of, I’ll make a few calls and see what happens.

  4. Let me try this one more time. I’m a free-lance medical reporter. I do guest shots on radio. One of those shows has millions of listeners. If UEA CRU has destroyed all their raw data that supports their hypothesis on man-made warming, this is MAJOR, and someone needs to query the press desks by phone at the other major climate research centers. I’ll make the calls. Just give me the names of the key centers and their numbers if you have them. I’ll ask them if they still have all their raw data, and if so, will they release the package for independent inspection.

    Isn’t this a logical step?

  5. #4 You are asking us something we cannot answer. One of the pieces of data missing is which centers were used for the creation of the datasets. They currently make the point that GHCN data are mostly used, so Jones says start there. Unfortunately, this is a very large dataset where different stations have different problems and all stations have corrections. The raw data is available, but I’m sure not all stations are used. If, say, 20% of the data were not used, the question becomes which 20%, and why. Also, if 95% of the data is GHCN, what is the other 5%, and how is it weighted in the more sensitive historic record?

  6. Jon,

    Here is an excerpt from an email from Jones.

    > Almost all the data we have in the CRU archive is exactly the same
    > as in the Global Historical Climatology Network (GHCN) archive used
    > by the NOAA National Climatic Data Center [see here and here].
    > The original raw data are not “lost.” I could reconstruct what we
    > had from U.S. Department of Energy reports we published in the
    > mid-1980s. I would start with the GHCN data. I know that the effort
    > would be a complete waste of time, though.

    but Jones won’t provide even a list of the stations used.

  7. Jeff—

    I don’t think anyone knows what the hell Jones is really saying. How can he reconstruct the raw data from altered homogenized data? That’s like saying I can reconstruct the box score of a baseball game from an article about it in the newspaper. He’s going to infer millions of pieces of raw information from an altered selection?

    But okay, I think I get your point. So the burning question can be shrunk down to, WHERE IS ALL THE RAW DATA? Maybe I’m wrong, but I believe the question can be asked. It can not only be asked, it can be shoved down their throats for a few months.


    In other words, keep pushing the basic fraud angle.

    I don’t understand the complexities of the research, but I do know a story when I see it.


    That story I can understand. I realize there are multiple frauds here, and many of them are about models and projections and selective use of tree data and other very sophisticated and deceptive bullshit stories that are way beyond my pay grade. But the basic deal about raw data and who has the burden to come up with it, I think this is big. I think it can be pressed.

    As a very loose analogy—I want to find out if a horse race has been fixed. I hear about different kinds of shoes and how to measure them and which ones are okay and which ones are illegal, and I hear about photos which, from different angles, show the jockey steered another horse toward the rail, but the photos, some say, are deceptive (they’re lying), and so forth and so on—but if I can find out the horse got an injection of speed five minutes before the race, that’s the one I’m going after.

    Believe me, I’m not saying the intricacies of the high-level science are unimportant. The more frauds exposed the better. I’m just saying this basic one could have legs.

    Any thoughts?

  8. #7, I prefer the question: which data were used, for which years, and where can we get a copy of both the raw and corrected data? I mean, if he takes the bullcrap hide-in-the-sand approach that he doesn’t think scientists should save RAW data, then he should at least provide access to the adjusted data and the metadata indicating where the adjusted data came from.

    There is a big story in this and WUWT may be the group to break it with the surfacestations project. I believe all the primary ground networks are getting similar results now because it’s almost the same data (different stations chosen for different reasons) with the exact same corrections, i.e. GHCN. If that is the case, the argument that these datasets are reliable because they confirm each other is flat bogus. We really won’t know until CRU comes clean with a true list by year of station data used.
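    If a by-year station list ever does become available, checking how independent two networks really are is a small job. Here is a minimal sketch, assuming each network can be reduced to a mapping from year to station IDs. The function names and the IDs are made up for illustration; nothing here comes from CRU or GHCN.

```python
def shared_fraction(stations_a, stations_b):
    """Fraction of network A's stations that also appear in network B."""
    a, b = set(stations_a), set(stations_b)
    return len(a & b) / len(a) if a else 0.0


def overlap_by_year(lists_a, lists_b):
    """Given {year: station IDs} for two networks, return {year: shared fraction}.

    Only years present in both mappings are compared.
    """
    common_years = sorted(set(lists_a) & set(lists_b))
    return {y: shared_fraction(lists_a[y], lists_b[y]) for y in common_years}


# Hypothetical station lists, for illustration only.
network_a = {2000: ["S1", "S2", "S3", "S4"], 2001: ["S1", "S2"]}
network_b = {2000: ["S1", "S2", "S3", "X9"], 2001: ["S1", "S2"], 2002: ["S1"]}
print(overlap_by_year(network_a, network_b))
```

    A fraction near 1.0 in most years would mean agreement between the two products says little about the underlying data.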

  9. The AGW scientists in NZ claimed that they cook the books the same way all the other AGW scientists do. Since they all make the same adjustments, this must prove that all of their adjustments are correct and the resulting data is 100% correct. Right?!
    (yeah, right)

  10. #8 As I posted at Lucia’s, the publicly accessible GHCN data is not raw data, either. Adjustments have been made to the GHCN data – adjustments that can no longer be verified.

  11. Ryan,

    from the GHCN readme on the data they state the following:

    All files except this one were compressed with a standard UNIX compression.
    To uncompress the files, most operating systems will respond to:
    “uncompress filename.Z”, after which, the file is larger and the .Z ending is
    removed. Because the compressed files are binary, the file transfer
    protocol may have to be set to binary prior to downloading (in ftp, type bin).

    The three raw data files are:

    The versions of these data sets that have data which we adjusted
    to account for various non-climatic inhomogeneities are:

    I’ve not done very much with this data yet but plan to in the next few weeks. Are you saying these are adjusted too?

  12. Yes. The series in v2.mean, v2.max, and v2.min are actually (in many cases) composites. For example, go to the GISTEMP station selector, select “raw GHCN data”, and search for “Olenek”. It’s a Siberian station. You’ll see ID numbers 222241250000, 222241250001, 222241250002, 222241250003, 222241250004, and 222241250005. All of those represent different instruments. Click on any one of them and it will show you a graph with past temperatures in black lines and whatever was decided as the current primary station as blue squares and lines.

    Those black lines don’t even represent the raw data. I believe they are after adjusting the individual series to have similar means during the overlap period. While I could be mistaken, the curves match far too closely to be the actual raw data (in my opinion). I forget if the 1998 Peterson paper says what they are. It might. Anyway, depending on the station, some are from different elevations, some from airports vs. rural, and so forth. But they always match far too closely in my opinion to be the actual raw data (just look at the raw New Zealand data).
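    To illustrate what "adjusting the individual series to have similar means during the overlap period" could look like, here is a minimal sketch of a simple mean-offset merge. This is an assumption about the general technique, not the documented GHCN procedure, and the function names are invented for the example.

```python
from statistics import mean


def overlap_offset(reference, other):
    """Mean difference (reference - other) over the time keys both series share."""
    shared = sorted(set(reference) & set(other))
    if not shared:
        raise ValueError("series do not overlap")
    return mean(reference[k] - other[k] for k in shared)


def combine(reference, other):
    """Shift `other` by the overlap offset, then merge it into `reference`.

    Where both series have a value, the reference and shifted values are averaged.
    Returns the merged series and the offset that was applied.
    """
    offset = overlap_offset(reference, other)
    merged = dict(reference)
    for key, value in other.items():
        shifted = value + offset
        merged[key] = (merged[key] + shifted) / 2.0 if key in reference else shifted
    return merged, offset


# Two hypothetical instruments at one station, keyed by year.
old_site = {1: 10.0, 2: 11.0, 3: 12.0}
new_site = {2: 9.0, 3: 10.0, 4: 11.0}
merged, offset = combine(old_site, new_site)
print(merged, offset)
```

    Once only `merged` is kept, neither the offset nor the two original series can be recovered from it, which is the problem with having nothing but the combined record.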

    Now here’s the rub. You can’t get the data for 222241250004 from GHCN, for example. You can only get the combined series. Go ahead and look in your daily data . . . there are only combined series.

    I just so happen to know that the current reporting for Olenek (which is not used in GISS, for some reason) happens to be at a small local airport. That airport didn’t exist before 1980 (or thereabouts). So where was the thermometer before that? Who knows. The metadata for lots of these stations is very sparse.

    I contacted NOAA about this and requested the metadata and original 222241250000 – 222241250005 series so I could see for myself how they were combined. They told me they have neither the metadata nor the individual series. They suggested I contact the Russian meteorological agency. I did that, and the agency said they do not have the data, either, and did not know where to obtain it.

    The frustrating part is that it’s clear that NOAA had the data at one time and that they digitized it. Could it really be lost? I doubt it. But then I got involved in Antarctica and just let the GHCN thing drop.

    The reason it is important is because you can’t get the GHCN monthly values from the GHCN daily values for many of the Siberian stations. I found differences exceeding 2 degrees C in some cases. Without the original data, I have no way to figure out why this might be. I did get far enough to determine that, for the 28 stations I was looking at, the mean trend calculated from the monthly values was 0.3 deg C/century higher than when you calculated monthlies from the daily min/max and trended that result.
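    For anyone who wants to repeat that comparison, here is a minimal sketch of the idea: build monthly means from daily min/max, fit a straight line, and compare the slope with the one from the published monthly values. The record layout (year, month, tmin, tmax) and the function names are assumptions made for the example, not the GHCN file format, and the trend is a plain least-squares fit with no anomaly or seasonal handling.

```python
from collections import defaultdict
from statistics import mean


def monthly_from_daily(daily):
    """Turn (year, month, tmin, tmax) records into {(year, month): mean temperature}."""
    buckets = defaultdict(list)
    for year, month, tmin, tmax in daily:
        buckets[(year, month)].append((tmin + tmax) / 2.0)
    return {key: mean(values) for key, values in buckets.items()}


def trend_per_century(monthly):
    """Ordinary least-squares slope of a {(year, month): value} series, in deg C/century."""
    xs = [year + (month - 0.5) / 12.0 for (year, month) in monthly]
    ys = list(monthly.values())
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return 100.0 * sxy / sxx


def trend_difference(published_monthly, daily):
    """Published-monthly trend minus the trend of monthlies rebuilt from daily data."""
    return trend_per_century(published_monthly) - trend_per_century(monthly_from_daily(daily))
```

    A positive `trend_difference` means the published monthly series warms faster than the one rebuilt from the dailies. With real data you would want to work in anomalies and handle missing days before trusting the number.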

  13. #12, Thanks Ryan.

    I did get far enough to determine that, for the 28 stations I was looking at, the mean trend calculated from the monthly values was 0.3 deg C/century higher than when you calculated monthlies from the daily min/max and trended that result.

    You keep this up you’ll lose your denialist union card.

  14. I wonder what would happen if:

    Instead of relying on raw temperatures, or even adjusted raw temperatures or anomalies, the individual stationary stations were used to determine local temperature rates-of-change prior to being assembled into a global product.

    This should sidestep half of the adjustments right out of the box.

    SHAP: A station move is just two separate instruments – one prior to move-date, one post. There is no need to adjust, just to split appropriately.

    MMTS: You’ve switched instruments. Don’t adjust the raw data, just acknowledge the switch date and split the record.

    TOBS: You’ve switched methods. Split the record.

    Urbanization is tougher, but the current method doesn’t focus on finding the “true gridcell temperature” either.
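    The split-instead-of-adjust idea can be sketched in a few lines. This is only an illustration of the proposal above, assuming a record is a time-sorted list of (time, temperature) pairs and that the change dates are known. The function names and the length-weighted averaging of segment slopes are choices made for the example, not an established method.

```python
from statistics import mean


def split_record(record, break_times):
    """Split a time-sorted list of (time, temp) into segments at each break time."""
    segments, current = [], []
    breaks = sorted(break_times)
    i = 0
    for t, temp in record:
        # Close the current segment whenever a break date has been reached.
        while i < len(breaks) and t >= breaks[i]:
            if current:
                segments.append(current)
            current = []
            i += 1
        current.append((t, temp))
    if current:
        segments.append(current)
    return segments


def slope(segment):
    """Ordinary least-squares slope of one segment (temperature units per time unit)."""
    ts = [t for t, _ in segment]
    vs = [v for _, v in segment]
    t_bar, v_bar = mean(ts), mean(vs)
    sxx = sum((t - t_bar) ** 2 for t in ts)
    return sum((t - t_bar) * (v - v_bar) for t, v in zip(ts, vs)) / sxx


def station_rate(record, break_times, min_points=3):
    """Length-weighted mean of per-segment slopes; None if no segment is long enough."""
    segments = [s for s in split_record(record, break_times) if len(s) >= min_points]
    if not segments:
        return None
    total = sum(len(s) for s in segments)
    return sum(len(s) * slope(s) for s in segments) / total
```

    With a record that has a step change at a station move, `station_rate` ignores the step because no segment spans it, while `slope` on the unsplit record folds the step into the trend. The cost is that any real climate change happening across the break date is thrown away along with the artifact.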

  15. So Realclimate’s page is equivalent to saying “here is a dictionary with all the words you need” to justify censorship of a book.
