Trend Calculation Incorporating Seasonal Offsets
Posted by Jeff Id on February 23, 2010
In this post, RomanM has proposed an interesting concept, explained in terms that are very approachable for us technical folks. I’ve attempted to employ it and explain it further below. Rather than using a fixed offset value or no offset when averaging several temperature anomaly series, he uses a seasonal method which calculates a different offset for each month of each series. The goal of this method is to combine multiple temperature stations into a more accurate trend curve.
Simple tech side – Roman’s post describes an improved anomaly offsetting calculation which offsets two different temperature stations according to their monthly anomaly differences. The process is a least squares knitting of two temperature time series month by month, rather than a single least squares offset computed from all of the overlapping data in two or more series.
The rest of us – Roman’s method makes sense when you consider the different climates of various station positions. For instance, a station on the east side of Lake Michigan (right on the shoreline) experiences winds from the lake for the vast majority of the year. During the spring, summer and fall months, open water moderates the air temperature that the station experiences. The result is temperatures which are closer to the water surface temperature, whereas an inland station’s air temperature will see more daily/hourly variation. In the winter months, the lake freezes substantially, insulating the air from the water surface, so the water conducts less heat and has less effect on temperatures. The net result is that the anomaly differential from shore to inland has a seasonal component.
To take this seasonal variance into account, Roman applies 12 offsets per series, one for each calendar month, to align the separate station anomalies. He is the first I’ve seen address this in the instrumental temperature record, not that it hasn’t been done somewhere else.
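To make the idea of 12 monthly offsets concrete, here is a minimal two-station sketch in Python. It is not Roman’s code: the function names (seasonal_offsets, combine_two) are made up for illustration, both series are assumed to be monthly arrays of equal length starting in January with NaN for missing months, and the second station is simply shifted onto the first. For two stations, the least squares offset for each calendar month works out to the mean difference over the overlapping data for that month, which is what this computes.

```python
import numpy as np

def seasonal_offsets(a, b):
    """Return 12 offsets: for each calendar month, the mean of (a - b)
    over the months where both series have data.

    Assumes both arrays start in January and have the same length.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a - b                      # NaN wherever either series is missing
    offsets = np.zeros(12)
    for month in range(12):
        d = diff[month::12]           # every value for this calendar month
        d = d[~np.isnan(d)]           # keep only the overlapping data
        if d.size:
            offsets[month] = d.mean()
    return offsets

def combine_two(a, b):
    """Shift b onto a's level month by month, then average what is available."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    offsets = seasonal_offsets(a, b)
    month_index = np.arange(len(b)) % 12
    b_shifted = b + offsets[month_index]
    return np.nanmean(np.vstack([a, b_shifted]), axis=0)
```

A single fixed offset would replace the 12 values with one mean of (a - b) over all overlapping months, leaving any seasonal difference between the two stations in the combined series.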
I used a slightly modified version of the getstation4 algorithm which was used in my previous GHCN post. The algorithm automatically sorts stations to see if they are from the same instrument or different. If they are from the same instrument, an average is taken. Normally, if they are different instruments, the algorithm calculates the anomaly and then averages the data into a single series. In this case, the raw data is returned in multiple series.
The function call for the first three stations of GHCN is:
Tem is a multivariate time series containing the 4 series shown below.
Combining these series isn’t a straightforward matter. Since the algorithm has already determined that these four series are from different temperature stations, though, one method might be to simply compute the anomaly of each series and average the results. This is what I used in my recent GHCN gridded reconstruction.
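For reference, the simple anomaly-and-average approach can be sketched roughly as follows. This is an illustrative Python version rather than the code used for the reconstruction: the function names are hypothetical, each station is assumed to be a monthly array starting in January with NaN for missing months, and the anomaly baseline is assumed to be each station’s full record.

```python
import numpy as np

def monthly_anomaly(series):
    """Subtract each calendar month's mean from a monthly series.

    Assumes the series starts in January; NaN marks missing months.
    """
    series = np.asarray(series, dtype=float)
    anomaly = np.full_like(series, np.nan)
    for month in range(12):
        idx = slice(month, None, 12)  # all values for this calendar month
        anomaly[idx] = series[idx] - np.nanmean(series[idx])
    return anomaly

def average_anomalies(stations):
    """Average the anomalies of several equal-length station series,
    ignoring missing values at each time step."""
    anomalies = np.array([monthly_anomaly(s) for s in stations])
    return np.nanmean(anomalies, axis=0)
```

Note that each station’s anomaly here is computed over whatever months that station happens to have, so stations with different record lengths end up on different baselines. That is the weakness the seasonal offset method is meant to address.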
I fed the same data into Roman’s algorithm below.
Not terribly different but it has a slightly higher trend. Below is the difference.
If seasonal variation is taken into account, we can achieve a final result that is superior in accuracy to a standard non-seasonal anomaly combination. The variance in the offset curves above represents the existence and correction of a seasonal signal in the residuals found when knitting different temperature station anomalies together.