the Air Vent

Because the world needs another opinion


Posted by Jeff Id on December 22, 2009

No politics here. Personally, I’m so sick of politics and climategate, you wouldn’t believe it. I like numbers, science and math, not this constant idiocy. There is more of it to come though. People don’t realize, just how much shit you can pack into several thousand emails. – A thousand by topic, several by individual.

This post is the beginning of an investigation into the Siberian temperature data. Phil Jones personally took the time to kill two Siberia papers about CRU and the discrepancy between thermometers and the primary ‘temperature’ dataset in publication. The CRU data may be exactly right, but the zeal with which these papers were killed gives us a clue that there may be a problem. I’ve looked no further ahead than demonstrated here, so be warned we may prove Dr. Jones right.

First, this is the region in question.

The Siberian station locations from the GHCN network are plotted below. These stations represent a small subgroup of the continent (and the planet). The region is from 50-65 latitude and 100-120 longitude as defined by Lars Kamél in his criticism of CRUtem. The paper was apparently blocked from publication through intense efforts by Phil Climategate Jones and only now the paper is presented at CA.

Recently rejected two papers (one for JGR and for GRL) from people saying CRU has it wrong over Siberia. Went to town in both reviews, hopefully successfully. If either appears I will be very surprised, but you never know with GRL.

Lars’ paper was simple and apparently used an incredibly small set of temperature stations, which, if I were writing a paper, would make me nervous. I’ve got no inside info though, so we’ll see what we find together. The dots below indicate the 26 locations from which data exists in the GHCN network.

Figure 2: Siberian temperature station locations. It's a little hard to see, but you're looking at Siberia at the top, Japan at the lower right and Mongolia in the lower center.

Well, for the non-regulars, I prefer to look first at the original data in a project before doing anything. In the case of climate temperatures, that means the non-homogenized raw…ish data. By the time we see it, there have been multiple, non-standard QC steps which correct for several factors including time of day, instrument type and other things. To be clear, this post looks only at the raw GHCN temp data, which I have anomalized to remove monthly seasonal temp variation, i.e., average all the Januaries, Februaries and so on, then subtract each calendar month’s average from that month’s readings.
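For anyone who wants to try the anomaly step themselves, here is a minimal Python sketch. It is not the code behind the plots in this post; the function name `monthly_anomalies` is made up for illustration, and it assumes a monthly series that starts in January, uses NaN for missing months, and takes an anomaly to be the reading minus that calendar month's long-term mean.

```python
import numpy as np

def monthly_anomalies(temps):
    """Remove the seasonal cycle from a monthly temperature series.

    temps: 1-D sequence of monthly means starting in January, with NaN
    for missing months (an assumed layout, not the GHCN file format).
    """
    temps = np.asarray(temps, dtype=float)
    months = np.arange(temps.size) % 12          # 0 = Jan, ..., 11 = Dec
    anomalies = np.full(temps.shape, np.nan)
    for m in range(12):
        sel = months == m
        if np.any(~np.isnan(temps[sel])):
            # Subtract the long-term mean of this calendar month.
            anomalies[sel] = temps[sel] - np.nanmean(temps[sel])
    return anomalies
```

Run on each station separately, this leaves every station centered on zero, so stations with very different climates can be plotted on one axis.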

As a personal pet peeve, the time of day corrections to raw data are particularly contentious. This is a spot where trends of a few tenths of a degree over a century could be accidentally added very easily. Better documentation of adjustments and the reasoning is required in the primary metadata rather than in an obscure paper which may or may not exist. In this series of Siberia posts, we’ll ignore that detail. The raw anomaly data is shown below.

Ok, so above is all the GHCN data available for Siberia. There are a few stations with strong uptrends, some stations with very minimal uptrend and a few with a minimal downtrend.

The average of the stations is shown below. This average is not weighted by area, and this post is only a beginning. I still can’t help but point out that the 20th century warming is supposedly the fastest and most extreme event in thousands of years.
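As a rough illustration of what "not weighted by area" means, the sketch below averages station anomalies month by month and fits a straight line to the result. It is a hedged example rather than the script used for the plot; the function names and the array layout (one row per station, NaN for missing months) are assumptions made for the example.

```python
import numpy as np

def unweighted_mean(station_anoms):
    """Average stations month by month, ignoring missing values.

    station_anoms: 2-D array, shape (n_stations, n_months), NaN = missing.
    Every reporting station counts equally; no area weighting is applied.
    """
    a = np.asarray(station_anoms, dtype=float)
    count = np.sum(~np.isnan(a), axis=0)
    total = np.nansum(a, axis=0)
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)

def trend_per_decade(series, months_per_year=12):
    """Ordinary least squares slope of a monthly series, per decade."""
    y = np.asarray(series, dtype=float)
    t = np.arange(y.size) / months_per_year      # time in years
    ok = ~np.isnan(y)
    slope_per_year = np.polyfit(t[ok], y[ok], 1)[0]
    return 10.0 * slope_per_year
```

One consequence of the equal weighting is that a tight cluster of stations pulls the average toward its own behavior, which is one reason an area-weighted version can give a different answer.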

Now doctor Phil and CRU may yet be right. This doesn’t disprove a massive 4ish C per century trend, but by looking at the rawish data, it’s a really tough result to imagine.

For the 4C trend, see the big red area in the top center below.

41 Responses to “Siberia”

  1. […] Jeff Id begins investigating Siberia at The Air Vent: Siberia […]

  2. Greg said

    “No politics here. Personally, I’m so sick of politics and climategate, you wouldn’t believe it. I like numbers, science and math, not this constant idiocy. There is more of it to come though. People don’t realize, just how much shit you can pack into several thousand emails.”

    Much as I enjoy the political part I do have to agree that it is extremely counter-productive when you’re trying to pick out the facts of the situation. It also makes it harder to identify science that is merely sloppy from that which is biased due to self-interest (they provide the grants so we should make them happy) and then also decide which is actually fraud.

    Love your blog, keep it up!

  3. Peter said

    Jeff, slightly OT, but do you know what CRU does with the poles, if anything? If they smear temperature at the edges over the pole, could Siberia be doing double duty?

  4. PaulM said

    But so far you have not compensated for the fact that some of these stations (in fact mostly the ones with the steepest rise) are in cities. For example, 30309, which sticks out in the first set, is Bratsk, pop 1/4 M. Similarly Chita, 30758, is a big city.

    So what happens if you take the subset of your 26 that are rural stations, and plot the average of those?

  5. PaulM said

    30823 – Ulan Ude, pop over 1/4 M
    30710 – Irkutsk, pop over 1/2 M

    I think these are probably the 4 biggest cities on your list.
    What’s the trend of the remaining 22?
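A minimal sketch of the split suggested above, assuming each station's anomaly series is held in a dictionary keyed by station ID. The four city IDs are the ones named in these two comments; the function name `split_mean` and the data layout are hypothetical, and no real station values are included.

```python
import numpy as np

# The four stations identified above as large cities:
# Bratsk, Chita, Ulan Ude, Irkutsk.
URBAN_IDS = {30309, 30758, 30823, 30710}

def split_mean(series_by_id, urban_ids=URBAN_IDS):
    """Return (urban_mean, rest_mean) as unweighted month-by-month averages.

    series_by_id: dict mapping station ID to an equal-length anomaly
    series, NaN for missing months (an assumed layout).
    """
    def mean_of(ids):
        rows = np.array([series_by_id[i] for i in ids], dtype=float)
        return np.nanmean(rows, axis=0)

    urban = [i for i in series_by_id if i in urban_ids]
    rest = [i for i in series_by_id if i not in urban_ids]
    return mean_of(urban), mean_of(rest)
```

Comparing the trend of the two averages would show whether the city stations are driving the overall slope.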

  6. Peter said

    yes, it’s very odd indeed that it’s up to blog sites such as this one to distinguish fact from fiction in the works of climate scientists. Eventually, this has to come to a point where the list of falsehoods peddled by them will be so long it must be resolved in a court of law to bring them to account. Then and only then can the debate be finally resolved.

  7. Olympus Mons said

    Can anyone tell me how I can get raw data for Lisbon, Portugal?

  8. JohnH said

    The MET office is spinning a different slant on their data, they are saying that the IEA and the HadCrut correlate well but both badly underestimate the warming shown by their new study which uses lots more data eg extra stations and other feeds. They conveniently do not supply any refs to released peer reviewed papers and no data, just one of their nice graphics. There is no way of checking their claims.

    The outfit that came up with this new study is the ECMWF (the European Centre for Medium-Range Weather Forecasts) with input from the Met Office

  9. BarryW said

    Your chart says that’s .033 per Decade which only gives .33 per century not 4 deg. Wrong units?

  10. dearieme said

    I sit here looking out at a 19th Century “Dickensian winter” and it occurs to me that all this has become a battle between 19th- Century-style “Gentlemen Scientists” on the blogs, motivated by intellectual curiosity, and 20th-Century-style Professional Scientists, motivated as a Profession – that is, as Adam Smith remarked, as a conspiracy against the layman.

  11. Jeff Id said

    #9 Nope, the units are correct, that’s the answer I’m getting. The 4 C is from the CRU plot at the bottom. The CRU plot is from 61-90 which has a greater upslope in the chart.

    #4,5 Paul,

    That’s exactly what I want to do. There aren’t so many datasets here that we need computers to go through them and sort. We don’t need peer review to have the common sense to go through the thermometer data ourselves and figure out what the actual data says. After all, most of us have been reading thermometers since we were kids.

  12. PaulM said

    # 11, great – I look forward to seeing the next installment.

    You probably have this, but there is useful info about locations and station moves at

    Also the first of these has a link to raw (?) temperature data. It might be interesting to see how this compares with the raw (?) GHCN data that you are using. [Just looking at the numbers – at some of these places it’s +20 in summer and -25 in winter, that’s a difference of 45 degrees. Kind of puts 1 degree per century in context!]

  13. Mesa said

    Well, the last chart presented here certainly shows a very strong warming trend for the last 40 yrs….possibly like 4 c/century…..(note the units on the charts are in full degrees C, ie full scale is 5 C not .5 C). So it really depends on the time period chosen. Obviously this is a high variability climate region as others have noted.

  14. RobertM said

    Think you have just confirmed MET office results. Eyeballing the MET office chart and your site map indicates you are looking into a region (100-120E 50-60N, i.e. just north of Mongolia). The MET chart shows blues and oranges in this region, so this region has had slight cooling or at least not much warming, so it looks like your result is not inconsistent with them.

    The “red hot” region is at about 60E 60N. Perhaps you can do the same analysis for this region.

    Does anyone have a larger resolution version of the MET chart and is there an equivalent GISS one? Would be nice to compare the two.

  15. Jeff Id said

    #14, This analysis will keep expanding, the region was chosen because there’s a paper on it that was rejected claiming error in the CRU data. If there is an error, it will be interesting to find, if not then we know not every region has their foot on the scale like the Darwin Australian station. I’d like to throw the gridded CRU into the mix next and see if it matches, also, I’m very curious if urban warming can be seen in the data as Paul says above. Then we need to check if the ‘adjusted’ data makes any sense. One or two posts. After that, we can move on to expand the region to cover all of Russia.

    This is going to turn into a big project. I will be doing area weighted versions and some QC analysis of the data based on what we learn. Curiosity drives most of the science posts here.

  16. kdk33 said

    It seems to me that the station data is either good and we should use it, or it’s bad and we should throw it out – I’m very skeptical about all the corrections, no matter how well intentioned. I don’t see the point in trying to calculate a global parameter – usually some kind of weighted average. So, I’d like to see a surface station exercise that uses only high quality stations and doesn’t try to convolute the data in any way.

    I don’t know the complete list of quality criteria, but a good start might be: a long unbroken history, a rural setting (away from UHI), no (or only minor) relocations, some kind of data screen to look for discontinuities.

    Why average? The theory is global warming. I don’t think you can hide from it. It’s either warming up or it isn’t. Let’s just look at the behavior of each of the quality stations. Are they all warming? Only some? When? Only in certain areas? All at the same time? Different times in different places? Seems like we could answer these kinds of questions without calculating complicated averages.

    Maybe this has already been done…

  17. Roger A said


    Nice report…. but… If you look at the CRU chart, the area where your sites are located in Siberia actually has a negative temperature slope. The ‘big red spot’ is somewhere to the east of where your temperature data comes from. Just curious if you noted that. It seems that the CRU chart is not too far off from what you found .. at the location where you selected your sites.

    I’m really a big fan, keep up the great work here.


  18. Roger A said

    OOPS! Big red spot is WEST of your sites



  19. Kenneth Fritsch said

    Jeff ID, you have plotted a trend for the period as long as the range of the stations that reported, with only one covering the earliest part of your plot and the second going back to 1850 or so. I did a similar plot using CRU data from KNMI and found if you go back to the 1850s the trend is much smaller for the Russian region. But come on, when you do that you have to emphasize the dearth of data going back that far and the uncertainty that that implies.

    Look at the color coded surface temperature map you show here and the great differences in temperature trends and see what that implies when you use only a few stations.

    What I see is that going back in time we have much uncertainty in temperatures and more than I judge the major temperature set owners are willing to admit. That the Russian region is in a warming period over the past 3 decades is further in evidence by the agreement of the UAH and CRU temperature trends for the period 1979-2009. Actually GISS gives a statistically significant larger trend for this period and it is that difference that could put some doubt in what GISS is doing with the data.

    Could it have been warmer going back into the 1800s in the Russia region? I think the uncertainty arising from the lack of quality readings with sufficient geographical coverage makes that call a difficult one to make.

    Jeff, I think you were remiss in not showing the trend for 1900-2009 (or 1990) and 1979-2009.

  20. Jeff Id said

    Roger, you’re right about that. I’m going to keep expanding this after comparing these results to CRU. CRU may be too low in this area and too high in another so they balance. I really don’t know what to expect.

  21. Jeff Id said

    #19 Always the critic. I finished at 1am so it was not remiss but exhaustion, try and be patient I’m a busy guy.

  22. Karl L said

    Hi Jeff,

    You said you anomalized by averaging all of the station readings. I noticed in the Hadley CRU climate data release they ran their normals from 1961-1990, or at least in most cases, and in the jones98 files they used 1951-1970… I have just gotten the data into R and am just starting to poke around. Soon I’ll be able to see for myself but do you know what the impact of changing the normalization period has on the plots?

  23. Jeff Id said

    #22 It’s mostly just an offset, but I also wonder how much annual signal is still in the curve. CRU picked a fixed range so historic values don’t change as the data is updated. I’m not sure the best way but perhaps we should match their work for comparison.

    I’ll post the code up later so you can see what it’s all coming from.

  24. Tom Fuller said

    You’re doing great–I’d keep emphasizing the ‘work in progress’ part to avoid confusion.

  25. Jeff;

    I borrowed one of your graphs for use as an attached image on the Sean Hannity Forums. I did link back to here and did not carry any of your work over. Hope you don’t mind.

  26. Jeff Id said

    #24, thanks Tom.

    #25, Feel free to use anything you want.

  27. PeterS said

    #16 kdk33, I’d go one step further and remove all readings taken from cities and airports as they are obviously contaminated by the urban heat island effect. No amount of correction can really get to a true temperature reading of the climate. We should be using only readings from remote areas where the surroundings have not been altered by man – only nature, except not around naturally hot areas like active volcanoes and hot springs. As an analogy, if one were to measure the temperature of a person, you wouldn’t place the thermometer between the two fingers that a person is holding a lighted cigarette, then use statistical methods to work out the body temperature.

  28. […] alarmist community. Even Australia’s PM is now being called out for fraud. And yet the data fraud continues unabated in the halls of climate science, with Gavin Schmidt repeatedly being exposed as another […]

  29. I’ve started looking into the Siberian temperatures as a result of the Kamel paper as well. The first thing you’ll find is that even an individual station is not as it seems – Kirensk in the CRU data is significantly colder than GHCN raw before about 1950, though they are nearly identical afterwards. Check it out .

    Also I downloaded the daily records for GHCN and they tell an interesting story, too. The data for Kirensk and I believe many others in the area is very sporadic after about 1999 – though GHCN and CRU have regular monthly averages for the full period as if it were reporting regularly, even though many months and even years have no actual daily readings for the station. No doubt in those cases they just fill in estimates from neighbours, so if you’re trying to compare neighbouring stations I’m sure you’ll find they match very well in a lot of cases, seeing as they’ve been filled in from each other.

  30. KevinM said

    Keep going Jeff.

    The Darwin Zero post on WUWT is very important, it conveys the idea strongly to doubters.
    If you could follow it up with another, it will keep the train rolling.

    The world (more of it every day) is watching.

  31. Okay, I ran my analysis for GHCN, CRU and the daily records for the stations where I could find data.

    Sorted by their trend the stations look like this in GHCN:
    row station trend
    X13 30555 -0.105258402
    X15 30636 -0.037822394
    X5 30230 -0.003176554
    X3 24908 0.013998778
    X17 30710 0.053390627
    X8 30372 0.072530530
    X21 30879 0.085711194
    X16 30673 0.096941880
    X4 30054 0.099595734
    X2 24817 0.103394048
    X6 30253 0.106219427
    X23 30965 0.106269797
    X11 30521 0.109922785
    X22 30925 0.119832298
    X18 30758 0.136873934
    X19 30777 0.139594920
    X 24507 0.146745813
    X10 30469 0.156795471
    X20 30823 0.159356034
    X1 24738 0.178645039
    X9 30433 0.210573022
    X7 30309 0.266606513
    X25 44241 0.363756159
    X14 30635 0.412515569
    X24 44207 0.424668840
    X12 30554 0.476778396

    And in CRU:
    row station trend
    X10 30636 -0.05524033
    X3 24908 -0.03742590
    X7 30372 0.02657753
    X12 30710 0.04824257
    X11 30673 0.05401662
    X 24507 0.05519681
    X15 30879 0.10198393
    X2 24817 0.10224075
    X4 30054 0.13624512
    X13 30758 0.14493106
    X5 30230 0.14786168
    X18 30965 0.16480309
    X14 30777 0.17787170
    X1 24738 0.21610950
    X8 30521 0.23441882
    X6 30309 0.26081155
    X17 30949 0.30979171
    X16 30925 0.32540897
    X9 30554 0.47340926

    And from the GHCN Daily records:
    row station trend
    X 24507 -0.09860173
    X5 30230 -0.06825661
    X3 24908 0.02783590
    X11 30555 0.03081327
    X13 30673 0.07054361
    X15 30758 0.11107082
    X2 24817 0.12834866
    X20 30965 0.18003243
    X18 30879 0.19755504
    X16 30777 0.24308219
    X8 30372 0.24329867
    X1 24738 0.28627422
    X12 30636 0.30923433
    X4 30054 0.31196364
    X19 30925 0.32589074
    X6 30253 0.34118679
    X9 30433 0.41396977
    X10 30521 0.42226809
    X14 30710 0.46900457
    X17 30823 0.62702715
    X7 30309 0.67767167

    The trends aren’t really comparable since they don’t cover the same periods but it does tend to show which have the strongest warming. Certainly the overall average shows some significant warming in all three sources.

  32. […] Siberia No politics here. Personally, I’m so sick of politics and climategate, you wouldn’t believe it. I like […] […]

  33. kdk33 said

    @ PeterS

    “As an analogy, if one were to measure the temperature of a person, you wouldn’t place the thermometer between the two fingers that a person is holding a lighted cigarette, then use statistical methods to work out the body temperature.”


    I’m of the opinion that computing a meaningful (in the context of AGW) global temperature parameter from thousands of (mostly volunteer) error prone weather stations, inadequately (wild understatement) distributed through both time and space, over which you have very little control, and about which you have incomplete information is a fool’s errand.

    I could be wrong.

    I’m very curious to see what Jeff and others discover.

  34. Bob said

    Love it, love it, love it. Excellent work.

    Jeff Id, when do you expect to finish your analysis and determine whether or not the CRU adjustments were justified? I want to use your research in my debates with the warmists, but the data must be rock solid.

    Thanks, Bob.

  35. Jeff Id said

    #34, I’ve got no idea. Right now I’m getting ready to run the Perl CRU script on their modified data. I’ve never run Perl before, but the code released is ungodly simple. It’s very nice that way. First I need to run it on their data, then I’ll translate to R, then run it on the raw data I can obtain for the same stations.

    Siberia is part of this but it might have to wait.

  36. Kenneth Fritsch said

    I would like to see a serious discussion here on a proper statistical method of determining the confidence intervals for trends for temperature series such as we are analyzing here.

    I know one can look at the trend for a region by averaging/weighting the region’s station data and obtaining confidence intervals for a straight line regression of temperature versus years/months and applying an adjustment for the autocorrelation of the data.

    I could also calculate confidence intervals for each year/month by determining the average for that year/month for all stations and then determining the inter station standard deviations. But then how do I get from there to confidence intervals for the straight line trend from regression?

    None of the above addresses the uncertainty that involves the spatial coverage of stations for a given region – that has to be affected by the station to station differences in trends. One can, I suppose, make several different assumptions about the distribution of temperatures in the areas between stations and look at each case.

    Would it be simpler to take all the measurement data together with any assumptions about estimating between station temperatures and do some kind of Monte Carlo calculation?
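The first approach mentioned above (a straight line fit with an adjustment for autocorrelation) can be sketched as follows. This is one common recipe, not necessarily the one anyone in this thread uses: the residuals' lag-1 autocorrelation r1 shrinks the sample size to n_eff = n(1 - r1)/(1 + r1), which widens the slope's standard error. The function name is made up, and using a normal multiplier of 1.96 instead of a t quantile is a simplifying assumption.

```python
import numpy as np

def trend_with_ar1_ci(y, z=1.96):
    """OLS trend with a confidence half-width widened for AR(1) residuals.

    y: 1-D series with no missing values, evenly spaced in time.
    Returns (slope, half_width), both in units of y per time step.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    t = np.arange(n, dtype=float)
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (slope * t + intercept)
    # Lag-1 autocorrelation of the regression residuals.
    r1 = np.sum(resid[:-1] * resid[1:]) / np.sum(resid ** 2)
    r1 = max(r1, 0.0)                  # never narrow the interval
    n_eff = n * (1.0 - r1) / (1.0 + r1)
    if n_eff <= 2.0:
        raise ValueError("too few effective samples for an interval")
    # Residual variance and slope standard error using n_eff.
    s2 = np.sum(resid ** 2) / (n_eff - 2.0)
    se = np.sqrt(s2 / np.sum((t - t.mean()) ** 2))
    return slope, z * se
```

This only covers the time-series part of the uncertainty; it says nothing about the spatial coverage problem raised in the same comment.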

  37. Earle Williams said

    Jeff, in #23 you mention the variations in baseline as resulting in an offset, and I tend to agree with you. If you were looking at annual temperatures then any change in baseline would only move your graph up or down the temperature scale by a constant amount. I wonder though if the recent decadal warming is occurring during specific months of the year if the offset assumption still holds true.

    I recall someone breaking down warming trends by month, but who it was escapes me at the moment. If for example there is a warming trend in winter but not in summer, are there any circumstances that would affect the shape or slope of the anomalized temperatures by using different baseline time periods?

    I can’t seem to reach a logical answer at the moment, perhaps insufficient caffeine. But since the anomalizing takes place on a monthly basis, it seems like there could be instances where there is a non-zero, and perhaps non-trivial, impact on the overall slope.

  38. DaleC said

    Hi Jeff,

    For station 30309, Bratsk, your plot starts from about 1900. But the file rs000030309.dly in ghcnd_all, downloaded and unzipped from

    has TMIN and TMAX only for some months in 1936, then from 1948 to 2000, and then nothing until 2008. Looking at the block of data from 13Feb1948 to 24Oct2000, I get 19248 days, of which for TMAX there are 12,630 data items (excludes missing and -9999), and for TMIN, 9,299 data items. To get the daily average needs both TMAX and TMIN, so there are only 9,299 usable pairs, which is 100*9299/19248 = 48%. How can monthly averages be reasonably computed when more than half the data is missing?

    This suggests that the monthly data you are using is not ‘original’. Personally, I think that all of these sorts of analyses should be done from the daily TMAX and TMIN. There are all sorts of horrors in the daily data which can be hidden in the monthly averages – things like -10 being recorded instead as -100 – so you need to eyeball a plot of the raw daily data, then check that wildly out-of-character values have been appropriately flagged, and then decide what to do with them.
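The coverage check described above can be sketched in a few lines. This assumes the daily TMAX and TMIN values have already been read into two equal-length lists, one entry per calendar day, with -9999 marking a missing value as in the counts quoted in that comment. The function name is hypothetical and no parser for the .dly file is included.

```python
MISSING = -9999  # missing-value code in the daily data

def usable_fraction(tmax, tmin, missing=MISSING):
    """Fraction of days on which both TMAX and TMIN are present.

    tmax, tmin: equal-length sequences, one entry per calendar day.
    """
    if len(tmax) != len(tmin):
        raise ValueError("tmax and tmin must cover the same days")
    pairs = sum(1 for hi, lo in zip(tmax, tmin)
                if hi != missing and lo != missing)
    return pairs / len(tmax)

# The arithmetic quoted in the comment above.
print(round(100 * 9299 / 19248))
```

Counting pairs day by day matters because the number of usable pairs is at most the smaller of the two counts, and is lower still if the TMIN days do not all coincide with TMAX days.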

  39. I agree DaleC – a look at the daily file is very enlightening in a lot of these stations.

    It is a pain to deal with because it increases the amount of data to process by more than an order of magnitude and you have to decide how to deal with the missing values. But when you try to compare the monthly series you don’t really know how many real observations from that station actually exist. In a lot of cases you’re looking at mostly in-filled data.

  40. DaleC said

    Oops I think that should have been from 1Feb1948 to 25Oct2000
    19,261 data points, 12,328 TMAX points, 9,093 TMIN.
