# the Air Vent

## Because the world needs another opinion


## GHCN, Sorting a box of sox

Posted by Jeff Id on January 16, 2010

So I've been trying to do a little more QC with the GHCN data. The problem with the raw data is that a single station ID has multiple curves and very little explanation as to what causes the differences. All of the following curves were taken from the same instrument. Yes, they have different data; yes, they have odd steps compared to each other; and unfortunately, careful examination reveals that the data is a mess.

If you take the anomaly of this data and then average, the trend has substantially less slope than if you first average and then take the anomaly. Since these are the same data, it makes sense to take the anomaly after averaging. If you have genuinely different datasets, it makes sense to take the anomaly before averaging.
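To see why the order matters, here is a toy R example (my own illustration with made-up numbers, not the GHCN data): two copies of the same trending series, one of which starts 25 years late. Averaging first recovers the underlying record, while anomalizing each copy first introduces a downward step where the short copy enters, flattening the slope.

```r
#toy illustration of anomaly-then-average vs average-then-anomaly on
#duplicate copies of the same series (made-up data, not GHCN)
set.seed(1)
yrs=1950:2000
base=0.01*(yrs-1950)+rnorm(length(yrs),sd=0.1)  #0.1 C/decade trend plus noise
s1=base                                          #full record
s2=base
s2[yrs<1975]=NA                                  #same instrument, late start

anom=function(x) x-mean(x,na.rm=TRUE)
trend=function(x) coef(lm(x~yrs))[2]*10          #deg C/decade

a1=rowMeans(cbind(anom(s1),anom(s2)),na.rm=TRUE) #anomaly first
a2=anom(rowMeans(cbind(s1,s2),na.rm=TRUE))       #average first
c(anom_first=trend(a1),avg_first=trend(a2))      #anomaly-first slope is lower
```

The short copy's anomaly is centered on its own (warmer) period, so averaging the anomalies drags the later years down relative to the average-first version.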

I spent about an hour this morning wrestling with Phil Jones's version of GHCN, which is somewhat different. I was able to determine that, in the case above, he used an average-first-then-anomaly combination, which in this case made sense.

To that end, I’ve redone my own gridded global temperature. It’s still not properly area weighted but each step gets closer. I focused primarily on the proper identification of sub-timeseries inside GHCN temperature stations which are actually from the same instrument.

Here is my improved GetStation algorithm for GHCN data compilation. You pass the right station id and it performs some QC to determine the correct method to combine the data.

#combine station data by determining whether each duplicate series is in fact the same
#NOTE: WordPress swallowed the lines of the original listing that contained a
#"&lt;" character; the sections marked "reconstructed" are rebuilt from the
#description in the post, not recovered from the original script.
getstation4=function(staid=91720)
{
  allraw=NA

  #raw data for this station id (reconstructed; "rawdata" is an assumed name
  #for the parsed GHCN v2.mean table, with station id, duplicate number and
  #year in columns 2, 3 and 4 and the twelve monthly values following)
  data=rawdata[rawdata[,2]==staid,]
  noser=levels(factor(data[,3]))   #duplicate-series numbers for this station

  for(i in noser)
  {
    #reconstructed: one duplicate series and the row index of each of its years
    subd=data[data[,3]==i,5:16]
    index=data[data[,3]==i,4]-startyear+1

    dat=array(NA,dim=c(12,(endyear-startyear)+1))
    for(j in 1:length(index))
    {
      dat[,index[j]]=as.numeric(subd[j,])
    }
    dim(dat)=c(length(dat),1)

    dat[dat==-9999]=NA               #GHCN missing-value flag
    dat=dat/10                       #tenths of a degree C -> deg C
    rawd=ts(dat,start=startyear,deltat=1/12)
    if(max(time(rawd))>=2010)
    {
      print("error series")
      rawd=NA
    }

    if(!is.ts(allraw))
    {
      allraw=rawd
    }else{
      allraw=ts.union(allraw,rawd)
    }
  }

  if(is.null(ncol(allraw)))          #single series: promote to one-column ts
  {
    allraw=ts(matrix(allraw,ncol=1),start=startyear,deltat=1/12)
  }
  nc=ncol(allraw)
  matchv=array(NA,dim=c(nc,nc))
  mv=rep(NA,nc)
  if(nc>1)
  {
    for(i in 1:nc)
    {
      for(j in 1:nc)
      {
        dt=allraw[,i]-allraw[,j]
        if(mean(abs(dt),na.rm=TRUE)<.2)
        {#data is the same
          matchv[i,j]=j
        }
      }
    }
    index=1
    for(i in 1:nc)
    {
      #reconstructed: assign each series the group of its lowest-numbered
      #match, or start a new group if no earlier series matched it
      first=min(matchv[i,],na.rm=TRUE)
      if(!is.na(mv[first]))
      {#found match to already matched number
        mv[i]=mv[first]
      }else{
        mv[i]=index
        index=index+1
      }
    }
  }else{
    mv=1
  }

  srs=levels(factor(mv))
  outar=array(NA,dim=c(nrow(allraw),length(srs)))
  index=1

  for(i in srs)
  {
    #reconstructed: average all copies of the "same" data before anomalizing
    cols=which(mv==as.numeric(i))
    if(length(cols)>1)
    {
      outar[,index]=rowMeans(allraw[,cols],na.rm=TRUE)
    }else{
      outar[,index]=allraw[,cols]
    }
    index=index+1
  }
  outar=ts(outar,start=time(allraw)[1],deltat=1/12)

  #anomalize the substantially different series and average into one record
  outara=calc.anom(outar)            #calc.anom() comes from earlier tAV scripts
  if(!is.null(ncol(outara)))
  {
    final=ts(rowMeans(outara,na.rm=TRUE),start=time(outara)[1],deltat=1/12)
  }else{
    final=outara
  }
  final
}

I hate what wordpress does to the formatting. The first section separates the data into timeseries as shown in Figure 1. This section was improved to recognize skipped years in the data.

After the data is in each series, they are all compared against each other to determine how similar they are. Since not many versions of the EXACT SAME data are an exact match, I used this line.

if (mean(abs(dt),na.rm=TRUE)<.2)

This line takes the mean of the absolute value of the difference. If the difference is on average greater than 0.2 it decides that it is not the same data. Consider what that means for our ability to detect 0.5C/century trends. I know of at least one instance where my eyes told me it was the same data yet it failed this test. I just wasn’t willing to loosen it further. Really it shouldn’t make much difference but wow it’s noisy stuff considering that it’s from the same instrument at the same time. Why is there any question at all whether the thermometer measured 25.5 or 25.1?
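As a toy illustration of what that test does (my own numbers, not GHCN), compare a series against a lightly perturbed copy and against a copy shifted by 0.5 C:

```r
#duplicate test sketch: mean absolute difference below 0.2 C counts as "same"
a=c(25.1,24.8,25.5,26.0)
b=a+c(0.1,-0.1,0.0,0.1)   #same data with small transcription noise
c2=a+0.5                  #a genuinely shifted series

same=function(x,y) mean(abs(x-y),na.rm=TRUE)<.2
same(a,b)   #TRUE  - mean abs difference is 0.075
same(a,c2)  #FALSE - mean abs difference is 0.5
```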

Anyway, after the algorithm decides that the data is the same, all of the ‘same’ data is averaged into a single time series before conversion to anomaly as you would sensibly do if you knew it was from the same instrument.

The last section of the algorithm is a bit complicated in its function. In the 8 series above, you get a list like: series 4 is the same as series 1, 3, 8, and in series 8 you get 1, 4, so you need to combine everything so that we know series 3 is also the same as 1, 4 and 8. Anyway, when it's all done, if there are multiple "substantially different" series, they are then anomalized and combined into a single GHCN time series.
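One standard way to do that transitive merging (a sketch of the idea, not the code above) is a tiny union-find: feed it the pairwise matches and every chained series collapses into one group.

```r
#union-find sketch for merging pairwise "same series" matches transitively
n=8
parent=1:n
find=function(i){while(parent[i]!=i)i=parent[i];i}
unite=function(i,j)parent[max(find(i),find(j))]<<-min(find(i),find(j))

pairs=rbind(c(4,1),c(4,3),c(4,8),c(8,1))  #matches as reported pairwise
for(k in 1:nrow(pairs))unite(pairs[k,1],pairs[k,2])
groups=sapply(1:n,find)
groups  #series 1, 3, 4 and 8 share one label; the rest stay singletons
```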

The data is then gridded and the north and south hemisphere are averaged separately.
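For the area weighting still on the to-do list, the textbook approach is to weight each grid cell by the cosine of its latitude. A minimal sketch with made-up 5x5 degree cell anomalies (this is not the post's actual gridding code):

```r
#cos(latitude) weighting sketch for a 5x5 degree grid of made-up anomalies
set.seed(42)
latc=seq(-87.5,87.5,by=5)                  #36 latitude band centers
grid=matrix(rnorm(36*72,mean=0.3),nrow=36) #36 lat bands x 72 lon bands
w=cos(latc*pi/180)                         #cell area shrinks toward the poles

band=rowMeans(grid,na.rm=TRUE)             #mean anomaly per latitude band
sh=weighted.mean(band[1:18],w[1:18])       #southern hemisphere
nh=weighted.mean(band[19:36],w[19:36])     #northern hemisphere
(nh+sh)/2                                  #hemispheres averaged separately
```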

Figure 2 - GHCN global Land Data. Black is this calculation, Red is CRU curve

A comparison to CRU is a little bit problematic as this doesn’t include any ocean data — 70% of the earth. It also doesn’t demonstrate the huge 1998 spike, don’t know why. Overall the pre-1970 data is substantially higher than the CRU version and consequently the trend has reduced century scale warming to about 0.45 C/century, although without ocean data this version shows more warming since 1980. It doesn’t make sense to look at century trend to me, but there has been less warming than advertised according to this curve.

At this point, there needs to be a lot more work on data QC and another look at urban vs rural. There are too many steps, oddities and station deletions in recent years to make any conclusions at this point but it seemed worthwhile to present the early results.

1. ### Pat Frank said

Hi Jeff — a couple of questions. In averaging a multiple data set like Figure 1, how do you weight each series? Perhaps by the number of points it contains? Or, perhaps better, by averaging the separate time-wise sections, each scaled by the number of points it contains? Or what?

Also, how does one know that all the data sets at a given station represent a single instrument? Maybe they represent a few redundant instruments at one station. That might explain why several data sets from one station differ over the same time range, as in your Figure 1. If multiple sensors were located well apart, one could see that unique local exposures, wind gusts, and irradiance could lead to different readings on the same day. If those are liquid-in-glass thermometers, their resolution is probably not better than about 0.5 C anyway.

2. ### timetochooseagain said

Well, the 1998 spike is largely tropical, especially in the equatorial Pacific. In other words, "classic" El Nino effect. Without that in the average, the spike should be substantially reduced.

3. ### Carrick said

JeffID:

A comparison to CRU is a little bit problematic as this doesn’t include any ocean data — 70% of the earth

You’re comparing to CRUTEM3, right? That is land only.

It also doesn’t demonstrate the huge 1998 spike, don’t know why.

Probably related to your unweighted average.

Overall the pre-1970 data is substantially higher than the CRU version and consequently the trend has reduced century scale warming to about 0.45 C/century, although without ocean data this version shows more warming since 1980. It doesn’t make sense to look at century trend to me, but there has been less warming than advertised according to this curve.

The other thing to keep in mind is that anthropogenic warming only started, according to the models, circa 1975.

Prior to that, anthropogenic CO2 and sulfates more or less balanced each other.

4. ### Geoff Sherrington said

At 2, timetochooseagain said
January 16, 2010 at 6:18 pm

Do you have any quantitative information showing the location of the 1998 hot year anomaly to support your contention that it was mainly in the tropics? I had thought that an El Nino transferred warm water from the Indonesia hot pool toward Chile, a mechanism that does not of itself cause heating, just redistribution.

I don't have the computing power, but if it was easy for someone else, it would be interesting to globally subtract year 1997 (or alternatively a 5 year average from 1992-1997) from year 1998 and watch the week-by-week movement of the anomalies.

5. ### Geoff Sherrington said

Hi Jeff,

Still concerned by changes about 1990 when IIRC the WMO decided on some procedures for reporting and selection and when the decline in use of a number of global stations started. I'm working mainly on Australia because I live here, so my comments might not apply globally. I float them to see if they do.

In the local context, I can report spaghetti graphs for places like Darwin, where the various adjusters seem to converge with their temperature curves as if they all agreed that there would be a date when all the ducks were in a line. This is pure speculation, but I see this 1990 +/- 3 years necking often. After that, the adjusters seem to diverge again.

Example –

Making the point that you possibly have secondary effects in your analysis that are hard to foresee, let alone correct. But you know that.

6. ### Ed said

An easy way to use code files within the WordPress editor is to switch to the HTML mode and put the listing inside the preformatted text HTML tag:

<pre>code goes here</pre>

Works for me.

7. ### Ed said

That is to say – ack – I have no idea how to display HTML tags within the WordPress entry box. Whatever, I meant to say: use the pre tag, with appropriate brackets, and /pre to close.

8. ### Peter of Sydney said

I’m still waiting for evidence that any of the temperature rise over the past 100 years is anything unusual compared to rises in temperature over other 100 year blocks in the history of the earth. Sometimes I wonder why AGW alarmists are exempt from providing evidence to prove a theory and act as if the man-made global warming thesis is an axiom.

9. ### DeWitt Payne said

In the UAH LT Tropical anomaly (Trpcs) plot, the February, 1998 peak of 1.31 C is a couple of tenths higher than for NH, SH or Globe. Ocean at 1.33 was higher than land at 1.25 C.

10. ### Jeff Id said

#3, Thanks Carrick. I guess that means that since 1978ish, the signal is definitely in the GHCN dataset. That makes me interested in a few other data QC options.

11. ### Anthony Watts said

“I hate what wordpress does to the formatting.”

Jeff –
try the “code” tags for wordpress. See here

http://en.blog.wordpress.com/2007/09/03/posting-source-code/

12. ### Fluffy Clouds (Tim L) said

gotta try this
now what did i doo (shift and enter)
ok try o it out

13. ### Gardy LaRoche said

Jeff,
you might want to consider applying the "pre" tag to preserve the format for imported text.

14. ### Fluffy Clouds (Tim L) said

wellll we are makeing a messss
2nd line should be lit up
pluss 3rd too
now 4th!?

well?

15. ### Fluffy Clouds (Tim L) said

if you use this tagging i might be able to follow along

been 30 years for some of this programming lol

16. ### RomanM said

If I may add to Anthony's comment, WordPress offers what I think may be a slightly better way to post R code than the simple "code" tag. There is a "sourcecode" tag which is described here:

http://en.support.wordpress.com/code/posting-source-code/

that works to keep the formatting suitable for cut and paste R scripts in WordPress (including keeping quote signs correct).

The option gutter = "false" will suppress line numbering (which otherwise makes a cut-and-paste script basically unrunnable) and the option wraplines = "false" makes it somewhat more readable.

Try playing with it a bit – it may help.

17. ### RomanM said

Oops, seems like I cross-posted with Tim ( I type slow. 😉 ). Delete my post if you wish, Jeff.

18. ### Fluffy Clouds (Tim L) said

roman lol
keep his and snip mine lol
his is better plus gutter
nite nite

19. ### Carrick said

Jeff, I use <pre> to include source code in my lecture notes.

I made this point over on Lucia’s blog: Your black line shows absolutely no warming before 1980. Now recognizing that this is work in progress, the point to be made is, as you say, there is a signal in the GHCN data for AGW even for your data.

In fact, for the black line, that’s the only warming signal there is. Although the GISS and CRU people have been highly criticized for doing so, when they increase the temperature of 1998 relative to 1934, what they are doing is increasing the importance of natural fluctuations relative to the AGW signal. That is, by making 1934 colder, they are diminishing the relative importance of the AGW component of warming to other natural temperature fluctuations.

Whether they realize what they are doing is a moot point, because this is what they are really doing.

20. ### timetochooseagain said

Really, what probably would have been better phraseology is that the spike is most prominent in the tropics and the ocean.

I can’t provide exactly what you may be looking for, however I have often wondered myself how a change in sea surface temperature patterns changes the global mean. I just do what climatologists are fond of when explaining this: I handwave and say “teleconnection”!

Jeff,

The problem with the raw data is that a single station ID has multiple curves and very little explanation as to what causes the differences. All of the following curves were taken from the same instrument. Yes they have different data, yes they have odd steps compared to each other and unfortunately careful examination reveals that the data is a mess.

I was working on something similar awhile back (only to get distracted and never post on it). I found the GHCN documentation (v2.temperature.readme) a little confusing. So confusing, in fact, that I initially read it as: there are multiple time series for a location (by lat/lon coordinates) which are read at different nearby locations.

As far as combining the data is concerned, I achieved this by using the first difference method. I was studying data by country (provided the country was ‘small’). I found some weird results. In one country, using the unadjusted data, the urban stations warmed slower than the suburban, which in turn warmed slower than rural areas. I guess you can call this the inverse-UHI effect? I might soon take a break from what I’m studying now (radiosondes and MSU) to post my findings.

22. ### timetochooseagain said

21-That would be interesting to see. It wouldn't surprise me if the behavior of some rural versus urban stations didn't seem to make any sense.

For one thing, I would imagine Urban observers are more able to afford maintenance. Stevenson Screens don’t paint themselves as the stuff chips off you know.

And the paint has certain properties, of course…

23. ### Geoff Sherrington said

Re 9, DeWitt Payne

Thanks for the convenient summary of UAH data. Let’s trust the data 100% and see where it leads us by the simple method of subtracting 1997 temperatures each month from 1998 temperatures.

Globally: The total, ocean and land values move in unison. Inference – there is not enough time for the sea to warm the air then warm the land. It looks more like an external warming affected the globe.

Highest: The greatest increase 1998-1997 within the 25 columns of data is for the USA 48. This has the same 4 blips in the year as I have illustrated for other places, not quite so clear, but there.

Lowest: The South Pole combined, SPole land and SPole ocean are all negative. 1997 was hotter than 1998 for these bands.

The oceans: In order of size of hottest to coldest change, NPole, NoExt, Trops, SoExt, SPole. Inference: The heat change did not start at the tropics, but at the North Pole.

The land: In order of hottest to coldest change, Trop, NoExt, SoExt, NPole, SPole. Inference: One heat mechanism operated for the ocean, another for the land.

The NPole temp difference illustrates the 4 peaks noted above, beautifully, with a max difference range of +2.3 to -1.3 deg C peak to trough.

Conclusion: There is a lot more than meets the eye about explaining the 1998 peak by assigning it to a strong El Nino. There is inferential suggestion that there was a global input of heat and that it occurred in pulses about 3 months apart.

Caveat 1: The simple method of subtracting 1997 from 1998 might produce artefacts. More study will proceed.

Caveat 2: This method of data treatment might be interacting with satellite data processing math.
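For anyone who wants to try it, the year-on-year check described in this comment is a one-liner; here is a sketch with made-up monthly anomalies (not the UAH numbers):

```r
#subtract each 1997 month from the matching 1998 month (toy values)
a1997=c(-0.1,0.0,0.1,0.2,0.1,0.0,0.1,0.2,0.3,0.2,0.1,0.0)
a1998=c( 0.4,0.7,1.3,0.9,0.6,0.5,0.5,0.6,0.4,0.3,0.2,0.2)
delta=a1998-a1997
which.max(delta)  #month of the biggest jump (3, i.e. March, in this toy case)
```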

24. ### Espen said

In your previous post, I found the long-running rural-only graph especially interesting. Do you have a new version of that as well?

Great work, btw. You do the work of an entire “climate research center” – you should apply for multi-million dollar funding 😉

25. ### Lucy Skywalker said

I found lots of versions for many of the stations in the GISS records too. I’m not happy with just averaging them. I think we may need to go back to individual station records and record-keepers to find out what happened, and why there are so many “record versions”.

We have here a mish-mash, all of which is questionable until we know its provenance. Several or even all “records” may actually be already-tampered-with, partly-adjusted material. I’ve been reading an absolutely brilliant redaction of the whole Climategate emails sequence, a Gotterdammerung, Twilight Of The Gods – a nail-biting saga – and what’s clear is that they really have lost, forgotten, or tampered with records, without proper record of such, and likely beyond repair except by fudge.

I would like to see a “recovery of Science” team go back to just a few individual global records that can be traced for a long period of time, with site history and individual current scaling for UHI (eg do urban transect). Better a few that are clean than a lot that are a mess – forget the gridding for the mo. I think these will be the best global pointers we can manage – and I think we can manage well enough. My comparisons of the thermometer records circling Yamal – Salehard show a high degree of internal consistency. Only the last few years cause me to suspect problems – and this shows up brilliantly on the Salehard seasonal graphs – it is winter that has gone rogue vis a vis the other seasons – district heating anyone??

26. ### curious said

re. 23 – Sorry if this is a bit tangential but I wonder if there is any line of investigation re: the temp. distributions which would be informative? I realise this is not in the same vein as Geoff’s exploration of outliers but thought it might be of interest. Apologies if it is known territory/not relevant.

For an individual long record station with good quality data I wonder if comparing successive year’s temp. distribution would yield any patterns? This is prompted by wind data being characterised by a Weibull function with shape and scale parameters. Changes in wind regime from one year to a next will show in the parameters of the function and I wonder if this is relevant to the temp. data. IMO they are both time series data which reflect an energetic content in the environment. Could it help identify temporal and regional similarities and differences in temp. regimes? To my mind this is a possible parallel line of investigation to looking for “signal” in the time series data.

I think there was some mention a while ago on CA re: the assumption of normal distribution of temps around a mean but I don’t recall the detail – brief Googling didn’t find it.

27. ### BarryW said

It may be that what you are seeing is the time varying nature of rural vs urban. I live outside of Washington DC. In describing where I was working at the time, my parents (who lived here in the 40's) said: Oh, you mean at the airport? Where there had been an airport (for small planes) was now a series of shopping centers and high rises (asphalt everywhere). UHI isn't just for cities! Secondly, as Anthony Watts has discovered, changes in sensors, sensor location, and micro climate may contribute to the rise (example: MMTS sensors which have to be located closer to buildings because of cable runs).

28. ### Jeff Id said

First difference creates a problem with monthly temperature data due to the level of weather variance. After integration, the offset for the entire anomaly data series is essentially the first existing month delta. The offset is often over a degree or more different than you would get if you took the filtered data first difference. This is a big problem with using first difference in anomaly data.

In the version below, I used more of the available co-existing points from the next nearest instrument to ensure that the anomaly is introduced with a proper level of offset.

https://noconsensus.wordpress.com/2009/08/31/area-weighted-antarctic-offset-reconstructions/

If you check this link, Figure 3 shows the continental trends created by 63 surface stations using various amounts of overlapped data to set the offset value. The far left side of the x axis corresponds to the first difference on unfiltered data. A negative continental trend was found which is inaccurate when compared to simply using non-offset anomaly data (Figure 1), RegEM, examination of individual surface stations, or other iterative MV methods.

As more co-existing values are used, the noise of introducing new surface station anomalies became less of a factor and figure 3 shows a gradual stabilization to a trend of a similar range to some of the fancy methods but more importantly in our case, it shows a good match to the spatial distribution of the other trends. You can see from this though, that we couldn’t obtain a reasonable result from first differences.
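The offset idea in the link can be sketched in a few lines (my own toy version, not the linked code): rather than anchoring a new station with a single first difference at the join, shift it so its mean over the whole overlap matches the existing record.

```r
#offset matching over the overlap instead of a single first difference
set.seed(3)
n=120
truth=0.002*(1:n)
a=truth+rnorm(n,sd=0.5)       #long station
b=truth+2.0+rnorm(n,sd=0.5)   #short station with an unknown bias
b[1:60]=NA                    #starts halfway through the record

ov=!is.na(a)&!is.na(b)        #co-existing months
badj=b-mean(b[ov]-a[ov])      #offset set from the full overlap
merged=rowMeans(cbind(a,badj),na.rm=TRUE)
```

Using all overlapping months averages out the weather noise that a single-month delta would bake into the series.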

29. ### NicL said

Jeff

Excellent work on improving the QC on the GHCN data, and I am glad to see that you fixed the problem with the original getstation function where there are gaps in the data.

However, I don’t understand why you only use the nearest WMO number part of each station number in getstation. GHCN has data for 7280 different stations, even though there are only 4495 distinct WMO numbers. Non-WMO stations are given the WMO number of the closest WMO station, with a non-zero modifier (field 3). Non-WMO stations also can have duplicate series. Separating out each of the 7280 separate stations before trying to combine duplicate series and perform QC simplifies those tasks.

I agree that it is very odd that substantially different data series often exist for a station. I have been using a simpler QC method than you, taking the mean of all series belonging to each of the 7280 stations but marking as NA months where the standard deviation between points in duplicate data series exceeded 4. Your method may be better.
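For comparison, the rule described in the paragraph above might look like this in R (my own toy numbers; I am assuming "standard deviation between points exceeded 4" means the SD across duplicates within a given month):

```r
#average duplicates per month, dropping months where they disagree badly
dup=cbind(c(12.1,13.0,25.0,14.2),
          c(12.3,12.8, 5.0,14.0),
          c(11.9,13.1,15.0,14.1))  #4 months x 3 duplicate series
m=rowMeans(dup)
m[apply(dup,1,sd)>4]=NA            #month 3 has sd 10 -> marked NA
m
```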

I would like to share some results that I have obtained using just long-record stations, which avoids possible distortions due to changes in the data set when building a long mean series from short record stations.

There are 1034 GHCN stations (not all WMO) with largely complete raw data for 1900-2005. Combining their records, unweighted, produces a raw series with a trend over that period of 0.0269 (all trends are stated as Deg C/ Decade). (I stopped at the end of 2005 as data for a huge number of Indian stations ended in March 2006.) As there was little change over the period in the set of stations with data, the trend of the combined series is bound to be close to the mean of the trends of the individual stations (which was 0.0256).

The corresponding trend in adjusted data for the 764 stations with largely complete adjusted data for 1900-2005 was 0.0536 for the combined series, double the trend in the raw data. Strangely, the trend in mean raw data for those 764 stations was only 0.0133; the adjustments appear to have quadrupled the trend in the raw data. I don’t know why. I would have expected sizeable adjustments for growing Urban Heat Island effects, which would reduce the trend from that in the raw data.

Of the 1034 GHCN stations with largely complete 1900-2005 data, 484 are classed as rural. Combining their records produces a raw series with a trend of only 0.0117.

In all cases, the confidence interval of the trend was a multiple of the trend itself.

The long record stations are concentrated in the USA, although there are 202 such stations elsewhere with raw data. Those stations are quite widely spread. The raw 1900-2005 trend for the 832 such stations in the USA is 0.0144, whilst for the 202 stations elsewhere it is much higher at 0.0805. However, only 28 of the 202 non-USA stations are rural, compared with 456 of the 832 USA stations. The difference in USA and non-USA trends (amounting to 0.7 deg.C over the 105 years) could largely be due to the UHI effect.

30. ### Jeff Id said

#29, Very very interesting, I’ll be working on some paper or other today ;), I wonder if we could humbly trouble you to present your results as a post with some graphs? It sounds like an excellent chance for discovery of some of the deeper issues with the datasets.

31. ### Layman Lurker said

#29

Only 28 long record stations outside of US are rural? Astounding.

32. ### Kenneth Fritsch said

Jeff ID, when you say there are duplicate station reportings from GHCN of as much as 5 per station, I am puzzled in the context of what I found and reported some time back from my extraction of GHCN data from KNMI. I found 23 stations that had duplicated latitude, longitude and altitude, and I assume that these are data from the exact same station location.

There are stations that are designated WMO stations, and then there are non-WMO stations nearby that have the same station number with a suffix following a decimal point. Many of these are very close together.

By the way, I am in the process of doing station comparisons within a 5 X 5 degree grid area and will now use GHCN data directly from GHCN and not filtered through KNMI. This will allow me to avoid some manual downloads from the KNMI climate data repository.

Jeff,

This is a big problem with using first difference in anomaly data.

Just woke up a while ago so I may not understand exactly what you mean. Do you mean applying first difference to anomaly data? I applied first difference to all the stations, averaged them together and calculated the cumsum.

34. ### Jeff Id said

#33 If you think of just two stations, one with a long record and one with a short one: both measure with weather noise in them, but we're interested in a trend. As the new (short) station is introduced, the first month delta provides a step which is added to the entire series – essentially this becomes the offset constant when the anomaly is re-summed.

We hope this step will be balanced with enough stations that some are positive and some are negative and we can overall get a good trend. However, this step has a very large effect on a 63 station trend as shown in the link I gave, because monthly anomalies of over 5 C are not uncommon and that has a great effect on a 0.5 C trend.

It may be true that on a global record there is enough data to cancel these effects out, but over a country with under a hundred stations, the weather noise creates big problems with the introduction and removal of individual series.
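The step described in #34 is easy to see in a stripped-down example (toy numbers, two stations only): difference each station, average the differences, and cumsum back.

```r
#one noisy entry delta becomes a permanent step after re-summing
n=24
longd=rep(0,n)                 #flat long station: all monthly deltas zero
shortd=rep(NA,n)
shortd[13]=5                   #short station enters on a 5 C weather swing
avgd=rowMeans(cbind(longd,shortd),na.rm=TRUE)
rebuilt=cumsum(avgd)
rebuilt[c(12,13,24)]           #0 before the entry, then a 2.5 C step forever
```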

#34
I haven’t observed a step in the data or the experiments I’ve done. The danger I’ve found in the method is that it creates very noticeable changes in variance. As more stations are added, the variance goes down. Then when a lot of them disappear, the variance gets larger. Not sure how to deal with this issue.

36. ### NicL said

#30. OK, I have been doing as you ask and will send you something, hopefully fairly soon. Not sure how deep it will be. But lots of graphs, unlike my first post at TAV last year. 🙂

37. ### Lucy Skywalker said

#29 NicL I really look forward to seeing your work. I’ve always tried to use the longterm stations. GISS ALWAYS shows trend in adjusted higher than trend in raw, exactly the opposite of what has got to be the case because of UHI. It seems they cannot “correct” current temperatures so instead they invent past temperatures that are systematically depressed. I’d like to see a few longterm individual stations – station history plus UHI by urban transects. Then we could see Europe’s temperature changes for a couple hundred years as well as the world for a century. I really mistrust the “gridbox” system as they’ve done it.

38. ### Geoff Sherrington said

26. curious said
January 17, 2010 at 7:50 am

The study of distributions after adjustments can be found at
http://www.gilestro.tk/2009/lots-of-smoke-hardly-any-gun-do-climatologists-falsify-data/

When I started to examine it, the first stations I looked at for a negative trend correction were Australian, and they were stations where early parts of the data series had been rejected by the BOM here. The long data are still available and GISS used them despite their partial rejection on quality issues by those who generated them. I did not persevere. You might find the same outcome.

Re looking at outliers, the method of subtracting one year from the next was chosen purely because it's so easy. There are several problems, such as the choice of a start point (e.g. for looking at the hot 1998 it might be better to use years from Oct-Sept because the temp rise started before Jan 1998) and whether one year (not say 2 or 5) is a good choice for basic time unit. At this point I stress that much more can be learned from the 1995-2000 period, both in rate of temp increase and decrease and in direction of movement of hotter material over the globe. Over to you smart stats guys.

Bob Tisdale’s animations are most useful because if you look at how fast the red replaces the blue globally (weeks, not months) then it starts to eliminate some mechanisms. It’s interesting that the analysis above was done on UAH lower trop satellite data, but it’s in good agreement with patterns shown in detailed over-land and over-sea dissections.

The main problem is Steve Mac’s ever present call for raw data. Some compliance would help.

39. ### NicL said

#37. Lucy,

Thanks. I have recently sent the piece to Jeff, so hopefully it will appear before too long.

I also have doubts about GHCN’s adjustment process, which is an issue I cover in my piece. It seems to adjust for discontinuities but not for gradual distortions produced by processes like the UHI effect. But as a data aggregator I suppose that they have limited knowledge about the histories of individual stations and have primarily to use purely statistical methods, which are unlikely to be able to detect gradual distortions.

Researching the history of individual long record stations, particularly the very limited number of such stations outside the USA, and estimating what adjustments should be made to reflect changes, would indeed be a valuable exercise. Perhaps we can suggest this to Anthony Watts as a new project for surfacestations.org. 🙂

40. ### boballab said

#38

Dr. Peterson at NCDC sent an email to Willis E. about the Darwin situation back in December, and Willis put it up on WUWT in his Darwin thread, but it was at the bottom of the comments and right before Christmas. With that said, Dr. Peterson pointed Willis towards the papers they based GHCN's adjustments on, but he stated that come this Feb or Mar NCDC will have completed an update to GHCN, with the difference being that they are going back and using the way they adjust USHCNv2. Here is the pertinent part of the email:

Partly in response to this concern, over the course of many years, a team here at NCDC developed a new approach to make homogeneity adjustments that had several advantages over the old approaches. Rather than building reference series it does a complex series of pairwise comparisons. Rather than using an adjustment technique (paper sent) that saw every change as a step function (which as the homogeneity review paper indicates was pretty standard back in the mid-1990s) the new approach can also look at slight trend differences (e.g., those that might be expected to be caused by the growth of a tree to the west of a station increasingly shading the station site in the late afternoon and thereby cooling maximum temperature data). That work was done by Matt Menne, Claude Williams and Russ Vose with papers published this year in the Journal of Climate (homogeneity adjustments) and the Bulletin of the AMS (USHCN version 2 which uses this technique).

Everyone here at NCDC is very pleased with their work and the rigor they applied to developing and evaluating it. They are currently in the process of applying their adjustment procedure to GHCN. Preliminary evaluation appears very, very promising (though of course some very remote stations like St Helena Island (which has a large discontinuity in the middle of its long record due to moving downhill) will not be able to be adjusted using this approach). GHCN is also undergoing a major update with the addition of newly available data. We currently expect to release the new version of GHCN in February or March along with all the processing software and intermediate files which will dramatically increase the transparency of our process and make the job of people like you who evaluate and try to duplicate surface temperature data processing much easier.

Note: they are going to release the computer code which will make Jeff a happy camper. However there is a little controversy about the paper the adjustments are based on. Dr. Pielke Sr is a little taken aback by it:
http://pielkeclimatesci.wordpress.com/2010/01/15/professional-discourtesy-by-the-national-climate-data-center/

From Dr. Pielke’s post he has a link to a PDF copy of the paper.

41. ### KevinUK said

If anyone is interested in looking at ‘global warming’ in full colour then take a look at this new thread that I’ve just put up on ‘digginintheclay’.

Mapping global warming

The main conclusion reached in the thread is that global warming is hardly global and that, based on the evidence shown in the colour coded trend maps I've presented in the thread, 'global warming' is not global but is in fact largely NH winter warming. I've stated that given what the maps show, it's hard to see how CO2 could be the cause of this warming unless the demon CO2 is happy to allow notable exceptions and chooses to selectively warm certain parts of the planet while allowing other parts to cool at the same time.

I’ve suggested that Western Australians apply for a rebate on their carbon taxes and have also recommended where ‘pommies’ like myself should all go if we want a good tan this summer.

Regards

KevinUK

42. ### DeWitt Payne said

…‘global warming’ is not global but is in fact largely NH winter warming.

If I understand the theory correctly, that’s exactly what one would expect from AGW, lower diurnal temperature range because of warmer nights and lower seasonal range because of warmer winters. It shows up more in the NH because the seasonal temperature swing is larger in the NH, more land and less ocean than the SH.

43. ### Carrick said

That’s my understanding too.

44. ### Geoff Sherrington said

Today’s work on one year lag from the 1998 hot year was a bit of a dud because for some reason unknown, I cannot get coloured lines from Excel graphs to carry through the image hosting process, so they come out black. However the dots retain their colours and you can trace them with difficulty. Here they are, one for land and one for sea. The key line has the aqua colours for above-tropic records.

Please let me know if you find a Rosetta Stone.

It seemed appropriate to post these on another blog also, as I have just done, so please excuse the impoliteness of sleeping around.