Posted by Jeff Id on March 17, 2010
This is just a short post on why people like Roman and Tamino are interested in calculating offsets for temperature anomalies, and why I'm interested in applying the method to global temperature. I hope it can put to rest some of the anomaly aversion that exists among skeptics.
I’ve already finished the global temperature calculations and am just working on documentation and verification. My guess, though, is that some readers don’t understand the obsessive effort that goes into offsetting anomalies. First, there are plenty of people who consider themselves data purists and sometimes state that only raw temperature should be used to calculate trend. In their view, anomaly is a processing step and is therefore suspect. While purity and minimal processing are fantastic goals, the anomaly calculation is required to compute accurate trends when data starts and stops at different points in a seasonal cycle, and it is equally critical when data is missing. Consider that a typical temperature curve looks like Figure 1.
You can see the gaps in the series. When you consider that we’re trying to detect tenths of a degree, it’s not difficult to imagine that if the gaps mostly happen at the tops of the pseudo-sine wave, your trend will be altered. Therefore, it’s appropriate to find another method. In this post ‘anomaly’ is calculated by averaging all the Januaries together, all the Februaries, and so on, and subtracting each monthly average from that month’s values. The result of the anomaly calculation is shown in Figure 2.
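For the code readers, here is a minimal sketch of that monthly anomaly calculation. This is my own illustration, not the exact code used for the figures; the function name is made up, and it assumes a monthly ts object ‘x’ that may contain NA gaps.

monthly.anomaly = function(x) {              #sketch: subtract each month's own average
  anom = x
  for (m in 1:12) {
    idx = (cycle(x) == m)                    #all the Januaries, all the Februaries, ...
    anom[idx] = x[idx] - mean(x[idx], na.rm=TRUE)   #subtract that month's average, ignoring gaps
  }
  anom                                       #returns the anomaly series
}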
The data gaps are closer to the mean temperature of the series now, and when you consider the vertical scale, they will have a reduced effect on trend compared to Figure 1. Think of anomaly as the deviation of each January from the average January and each February from the average February. If 1 C of true warming occurs and we have a complete record that ends in the same month it starts, we will get very close to the same trend with either raw data or anomaly methods.
The result is still not exactly the same between raw and anomaly because the trend is fit by minimizing the sum of squared deviations from the fitted line (that’s another topic), but it’s going to be close enough considering the other errors in the dataset. Very close, in fact. The anomaly is the better method in any case, because the extreme hot and cold months of summer and winter won’t have as great an influence on the least squares trend.
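To make the gap effect concrete, here is a toy check using the monthly.anomaly sketch above. This is entirely invented data, not the station in the figures: a 40 year series with a known 0.12 C/decade trend, a 10 C seasonal cycle, and the peak month missing for the first 20 years.

yrs = 1950 + (0:479)/12                               #40 years of monthly time steps
x   = ts(0.012*(yrs-1950) + 10*sin(2*pi*yrs), start=1950, deltat=1/12)   #trend plus seasonal cycle
x[cycle(x)==4 & yrs<1970] = NA                        #peak month (month 4 here) missing early on
a   = monthly.anomaly(x)                              #from the sketch above
coef(lm(as.numeric(x) ~ yrs))[2]*10                   #raw trend, C/decade - inflated well above 0.12
coef(lm(as.numeric(a) ~ yrs))[2]*10                   #anomaly trend, C/decade - close to 0.12

Because the hot readings are missing only in the early part of the record, the raw least squares fit is pulled well above the true 0.12 C/decade, while the anomaly version stays close to it.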
So, that said, one of the properties of an anomaly series is that it is centered (average = 0) on the vertical scale over the timeframe used to calculate it. In this case, the anomaly is calculated over the entire length of the available data, so each series is centered around its own mean. In HadCRU temperatures, there is no code for offsetting anomalies; they are simply averaged together to create a global trend. So why would Roman, Tamino, and guys like Ryan, Nic, SteveM, and even I spend so much time working on the proper combination of temperature anomalies to look at trend? Because there is a substantial improvement in trend accuracy to be made.
Below is the simplest example I could think of: a perfect linear trend measured by two temperature stations over a period of 100 years. The first station starts in 1900 at an absolute temperature of 0.001 C, warms linearly at 0.12 C/decade, and ends in 1980. The second station starts in 1960 at a different altitude and measures the same noiseless values as the first plus a constant offset of 0.5 C; its trend is of course the same, and it continues to the year 2000.
To say it another way, both stations have exactly the same slope, just offset from one another. The bottom station of Figure 3 is the same as the top one except that it has been cut short and had an offset of 0.5 C added to it, as you would expect from two nearby stations at different altitudes. If we take a simple average of the two, we get Figure 4 (sorry for the title).
We have defined the trend of both stations as 0.12 C/decade, yet when the raw series are averaged, the offset due to altitude creates an inaccurate and more positive trend of 0.179 C/decade.
If you can read a bit of code, you can see that both stations are derived from the exact same series, the second with 0.5 added on. So we really do know the true trend is exactly 0.12 C/decade.
pp=(1:1200)/1000                   #create 1200 month series called pp
pp=ts(pp,start=1900,deltat=1/12)   #turn it into months from year 1900-2000
pg=window(pp+.5,start=1960)        #second series starts in 1960 - 0.5C OFFSET
pp=window(pp,end=1980)             #first series ends in 1980 - 20 years of overlap with same values
oo=ts.union(pp,pg)                 #join into two column time series
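As a quick sanity check (a sketch in base R, with variable names of my own), the simple raw average of Figure 4 and its trend can be reproduced directly from ‘oo’:

rawavg = ts(rowMeans(oo, na.rm=TRUE), start=1900, deltat=1/12)   #simple average of the two raw series
yr     = as.numeric(time(rawavg))
coef(lm(as.numeric(rawavg) ~ yr))[2]*10                          #least squares trend in C/decade - near the 0.179 of Figure 4, not 0.12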
The following code centers the two series about their means as would happen with an anomaly calculation.
cm=colMeans(oo,na.rm=TRUE)   #calculate the mean of each series
oo[,1]=oo[,1]-cm[1]          #center column 1 to simulate anomaly
oo[,2]=oo[,2]-cm[2]          #center column 2 to simulate anomaly
It subtracts each series’ mean (stored in ‘cm’) from the corresponding column of ‘oo’. The centered series 1 and 2 are plotted in Figures 5 and 6. Note the perfect 0.12 C/decade trend of each series (plotted individually below), despite the raw average trend calculated in Figure 4.
Also take note that each series crosses zero at its halfway point. It’s the same with an anomaly calculation: the series become centered around zero.
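A quick check of those individual trends (again a sketch, using the centered ‘oo’ from above):

yr = as.numeric(time(oo))
coef(lm(as.numeric(oo[,1]) ~ yr))[2]*10   #station 1 trend, C/decade - 0.12, centering cannot change a slope
coef(lm(as.numeric(oo[,2]) ~ yr))[2]*10   #station 2 trend, C/decade - also 0.12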
So now that we have centered the series by a process similar to an anomaly calculation, what would the trend from a simple no-offset average look like? This next step is equivalent to the method of the Phil Jones CRUTEM series and likely equivalent to GISS. I’ve never heard of an offset being used in the GISS series, but I haven’t worked through it myself; say I’m 99 percent confident that GISS uses simple averages of anomalies as well. Since we know that a simple average of raw temperature stations causes problems, what does a simple average of anomalies do?
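In code, that simple no-offset average is just a row mean of the centered series (a sketch of the concept, not CRU’s actual code):

anomavg = ts(rowMeans(oo, na.rm=TRUE), start=1900, deltat=1/12)   #simple average of the two centered series
yr      = as.numeric(time(anomavg))
coef(lm(as.numeric(anomavg) ~ yr))[2]*10                          #trend in C/decade - noticeably less than the true 0.12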
In this case our two perfectly continuous 0.12 C/decade trends get sawtooth steps in them. So if we just average up-sloping anomaly series together, the trend is reduced. I hope those of you who may not have considered this effect take the time to think about it. In every single case where the true trend in the data is an upslope, the introduction and removal of temperature series in simple anomaly averaging works to REDUCE the trend of the TRUE data. It’s a very important point, for a couple of reasons.
Next, though, I looked at RomanM’s version of the offset series calculation. This method ensures that each series is regressed to a constant offset value, which re-aligns the anomalies. Think about that: we’re re-aligning the anomaly series with each other to remove the steps. If we use raw data (assuming up-sloping data), the steps in this case were positive with respect to trend, though sometimes the steps can be negative. If we use anomalies alone (assuming up-sloping data), the steps from added and removed series always act to reduce the actual trend. It’s an odd concept, but the key is that these stepped trends are NOT the TRUE trend; the true trend in this simple case is of course 0.12 C/decade.
So let’s use the RomanM data “hammer” on our non-seasonal data. It’s non-seasonal because it’s a perfect linear trend, and Roman’s method is a step beyond the concepts in this post in that it calculates not just a single offset per series but twelve offsets, one per month. In the example above, each month has a perfect noiseless trend and therefore the same offset, so his sophistication is not required; however, his stuff hammers this nail just fine. The code uses a least squares fit to determine the value to add to each series for the best match to the others. It’s an offset calculator.
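Before running Roman’s code, here is a bare-bones sketch of the offset idea for just two series. The variable names are mine, and this is the crude version (a single offset estimated from the mean difference over the overlap), not the least squares machinery inside temp.combine:

offset2     = mean(oo[,1] - oo[,2], na.rm=TRUE)   #mean difference over the 1960-1980 overlap
aligned     = oo
aligned[,2] = aligned[,2] + offset2               #re-align series 2 to series 1
combined    = ts(rowMeans(aligned, na.rm=TRUE), start=1900, deltat=1/12)   #average the aligned series
yr          = as.numeric(time(combined))
coef(lm(as.numeric(combined) ~ yr))[2]*10         #recovers the true 0.12 C/decade in this noiseless case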
As with a proper hammer, the code call is very simple.
offsetvalues=temp.combine(oo)                                          #call Roman's offset function
plt.avg(offsetvalues$temp,main.t="Row Average",x.pos=1910,y.pos=0.2)   #plot it
So when we calculate the best possible offsets and combine the series, the result is shown in Figure 8.
And there we go, a perfect trend by offsetting the two series to match.
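A quick check of that trend, assuming (as in the plt.avg call above) that temp.combine returns the combined monthly series in $temp:

yr = as.numeric(time(offsetvalues$temp))
coef(lm(as.numeric(offsetvalues$temp) ~ yr))[2]*10   #trend of the offset-combined series, C/decade - back at 0.12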
Now if you’ve followed the logic above, consider this point: Phil (Climategate, Warming Is Doom, 22 million dollars of grants) Jones has not used offset anomalies. I’m an engineer, and as my portion of our recent Antarctic publication (and as self-appointed skeptic of the month for November ’09), I calculated the continental trend using area-weighted offset anomalies. This method increased the trend from 0.05 (simple average) to 0.06. Consider that Ryan and Nic employed a sophisticated iterative algorithm to determine the proper offsets and weights for the Antarctic surface station anomalies based on weather patterns. And finally, consider that IF offset anomalies are not used and your data has a natural up-slope, YOU ALWAYS get a lower trend from a simple anomaly average. We’re busted!! The skeptics/denialists/disinformation spreaders are actively and endlessly working to increase the trend in surface temperature.
So we are forced to realize that this is the method for creating the proper trend, as Tamino did a good job of demonstrating, and for data with a general upslope the trend is definitely going to be greater than with simple averaging. Knowing further that the Climategate boys are head-over-heels advocates for massive warming, and knowing that models predict more warming than temperature measurements do, what does it mean when they don’t figure out how to do a proper anomaly offset, but the evil denialist skeptics do? How would Santer 09 read if CRU were done with offset anomalies?
However, the offset methods are a more accurate representation of the temperature trend. Again, skeptics of AGW must remember the lesson that the advocate crowd has thrown to the wind: we do not get to choose the results of math!! Personally, I would much rather work with good math and true results, no matter what they say, than with the Mannian hockey stick.