the Air Vent

The Antarctic, an engineer’s reconstruction.

I’ve been gone for the last couple of days. There are a number of interesting comments that I will get to soon, but I’ve been thinking about a different kind of reconstruction math using the same data. Several people have suggested different methods for reweighting surface station data, and several of these methods are in my opinion superior to RegEM. The big problem I have with Dr. Steig’s paper is the complete lack of verification. There is absolutely NO attempt to verify whether long term trends are copied in a sensible fashion. Of course a primary assumption of the reconstruction is that highly correlated signals must have the same long term trend. I’ve done some preliminary analysis which demonstrates that the long term and short term correlations of the satellite and ground signals do not match.

In addition, Jeff C and Ryan O have found very real reasons to question the trends in the satellite data HERE, HERE and HERE. There are apparent trends and steps which occur between different satellites and instrument packages. Although it isn’t proven yet, my belief is that the satellite data is likely unusable for reasonable trend analysis. This doesn’t mean the data has no use, though, because the higher frequency monthly temperatures are still available and still show high correlation to individual temperature stations HERE.

So a potential (and likely) problem in Steig09 is the exclusive use of satellite data in the reconstruction post 1982. There are several other problems with RegEM as well including the fact that negatively correlated (unrelated) reconstruction points actually receive a negative trend rather than a zero trend. So peninsula stations which have a strong negative correlation to other parts of the continent are definitely smeared, but they are smeared in a negative sense (Figure 1).

Figure 1

This situation is no better than a positive smear if you want to know trends. Without going into the other problems too deeply today, I’d like to present a reasonable solution.

The high frequency information from the satellites is of good quality according to our correlation plot, and the surface trends are the best quality we have available for ground data. The goal is of course to distribute surface stations spatially according to their area of influence across the Antarctic.

My thought was then to correlate the 42 surface stations with the satellite’s 5509 grid points and use those correlations to weight each surface station’s influence on each of the 5509 grid locations, reconstructing an all surface station trend.

The correlation coefficient from Wikipedia is:
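The formula itself did not carry over into this copy. The standard sample (Pearson) form, which is assumed here to be the one meant, written with the same x_i and y_i, is:

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here \bar{x} and \bar{y} are the sample means over the n months that both series have in common.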

Where x_i is the surface data and y_i is the satellite data. The reason I put this here is because I wanted to show the linear nature of the numerator. I used a linear method of weighting to calculate each gridcell of the reconstruction based on the linear response of correlation.

For each grid point the calculation looks like this, in the simplest way I could write it.

reconstruction [at this point] = (sum [ eachsurfacestation * each station correlation to sat]) / sum[each station correlation to sat]

I hope that’s more clear than a typical math equation. This is actually the form of a centroid calculation where each variable is weighted and then weights are divided out. I want people to appreciate the simplicity of the method.
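As a quick illustration of the centroid form, here is a minimal sketch for one grid point and one month. The three stations and their numbers are made up for the example and are not taken from the Antarctic data.

```r
# Toy example: one grid cell, three hypothetical stations, one month.
station_anom <- c(0.5, -0.2, 1.0)   # surface anomalies in deg C (made up)
station_cor  <- c(0.8, 0.4, 0.3)    # correlation of each station to the cell (made up)

# Weighted average: multiply by the weights, then divide the weights back out.
cell_anom <- sum(station_anom * station_cor) / sum(station_cor)
print(cell_anom)
```

Because all the weights are positive, the result always lies between the smallest and largest station value, which is what you would expect from a centroid.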

There are problems in the equations due to missing values. First you need to mask missing values just to calculate the correlation. Then you need to keep track of the sum of the correlations used in the above equation for each month, because each month of surface station data has a different set of missing values. There is also the problem of 100% missing values: how is that handled? Well I worked out the details below which will be presented elsewhere.
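The bookkeeping for missing values can be sketched on a toy matrix. This is only an illustration under assumed data: `surf`, `w` and the choice to leave an all-missing month as NA are hypothetical, not something the post specifies.

```r
# Hypothetical data: 4 months x 3 stations, NA marks a missing month.
surf <- matrix(c( 0.5,  NA, 1.0,
                  0.1, 0.3,  NA,
                   NA,  NA,  NA,
                 -0.4, 0.2, 0.6), nrow = 4, byrow = TRUE)
w <- c(0.8, 0.4, 0.3)                    # made-up correlations to one grid cell

present <- !is.na(surf)                  # TRUE where a station reported
wsum <- as.vector((present * 1) %*% w)   # per-month sum of the weights actually used

# Weight each station column, sum over stations ignoring NAs, then renormalise.
weighted <- sweep(surf, 2, w, "*")
cell <- rowSums(weighted, na.rm = TRUE) / wsum
cell[wsum == 0] <- NA                    # a month with no data at all stays missing
print(cell)
```

Each month is divided by the sum of only those weights whose stations reported, so a station dropping out does not bias the average toward zero.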

Here is a simple R script which does the job. My first R script was beautiful but it looked like a nice C algorithm. Unfortunately, testing showed it would have taken over 8 hours to run, so my first plan didn’t work at all. After re-writing, this version runs in 10 to 15 minutes. First I calculated the correlations. We can only correlate post 1982 data, so latesurf is created from the surface anomalies.

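#calc correlations sat to surface
#cors[j,i] = correlation of grid cell j with surface station i
cors=array(0,dim=c(5509,42))

#only post 1982 surface data overlaps the satellite record
latesurf=window(anomalies$surface,start=1982)

for(i in 1:42)
{
  mask=!is.na(latesurf[,i]) #months where station i reported
  if(sum(mask)!=0)
  {
    #correlate station i with every satellite grid cell over those months
    cors[,i]=cor(latesurf[mask,i],satraw_anom[mask,])
  }
  else
  {
    cors[,i]=-10 #flag: no overlap with the satellite era, never passes the threshold
  }
}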
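To see what the `cor()` call in the loop returns, here is a small check on synthetic data. The three "grid cells" are invented for the example; only the masking pattern mirrors the script above.

```r
set.seed(1)
station <- rnorm(24)                          # 24 months of one hypothetical station
sat <- cbind(station + rnorm(24, sd = 0.1),   # cell that tracks the station
             rnorm(24),                       # unrelated cell
             -station)                        # perfectly anti-correlated cell
station[c(3, 10)] <- NA                       # two missing station months

mask <- !is.na(station)                       # keep only months the station reported
r <- cor(station[mask], sat[mask, ])          # 1 x 3 matrix, one correlation per cell
print(r)
```

Passing a vector and a matrix to `cor()` gives one correlation per matrix column in a single call, which is why one station can be compared against all the grid cells without an inner loop.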

And the weighting algorithm.

#build matrix based on correlation weighting
recon=array(NA,dim=c(600,5509))
recon=ts(recon,start=1957,deltat=1/12)

for (i in 1:5509)
{
  #stations correlated at 0.2 or better with this grid cell;
  #which() also drops the -10 flag and any NA correlations
  use=which(cors[i,]>=.2)
  if(length(use)==0) next #no usable station: cell stays NA

  tt=anomalies$surface[,use,drop=FALSE] #600 months x k stations
  w=cors[i,use]

  #per month sum of the weights of the stations that actually reported
  wsum=rowSums(t(w*t(!is.na(tt))))

  #weight each station column by its correlation; going through t()
  #keeps R from recycling w down the rows instead of across stations
  wtd=t(w*t(tt))

  anom=rowSums(wtd,na.rm=TRUE)/wsum
  anom[wsum==0]=NA #month with no reporting station
  recon[,i]=anom
}

After the trends are added together and reweighted according to their correlation with high frequency satellite data, the total trend for the Antarctic looks like this.

Figure 2

The spatial distribution of the trend is more interesting. Figure 3 is the trend in temperature vs location.

Figure 3

The Antarctic shows a slight net warming of 0.084 C/decade according to the surface station data for the 50 year timeframe. Sorry guys, I did warn everyone that if that’s what the data shows that’s what we’ll see on tAV. What I find most interesting is that the peninsula warming trend is so incredibly isolated to a single region with a hard cutoff. This makes me question the peninsula station data. I ran this algorithm with several parameter changes and got trends usually around 0.073 C/decade.
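The post does not show how the trend numbers are computed. A minimal sketch, assuming an ordinary least-squares fit against time in decimal years (a common choice, but an assumption here), on a synthetic series with a slope chosen only for illustration:

```r
# Synthetic monthly anomaly series, 600 months from 1957, with a known slope.
set.seed(42)
yrs <- 1957 + (0:599) / 12
true_slope <- 0.0084                        # deg C per year, illustrative only
series <- ts(true_slope * (yrs - mean(yrs)) + rnorm(600, sd = 0.5),
             start = 1957, deltat = 1/12)

fit <- lm(series ~ time(series))            # time() returns decimal years
trend_per_decade <- unname(coef(fit)[2]) * 10
print(trend_per_decade)
```

The fitted slope comes out in degrees per year, so it is multiplied by 10 to express it in C/decade as in the text.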

The trend of 0.084 C/decade is substantially flatter than Steig’s reconstruction, although it is within Steig09’s huge margin of error of 0.12 +/- 0.07 C/decade. I do have to point out that this new value is very close to the re-gridded trend that Jeff C and I computed.

Another point is that we know ice levels have grown since the advent of satellite measurement. Despite the numerous explanations by pro-AGW scientists, this is most likely due to cooling temps. From Fig 2 you can see by eye that values prior to 1967 are significantly negative. Since the RC guys have admitted that Antarctic warming is now consistent with the models, I want to know what happened in the last 40 years, when the most CO2 was added to the atmosphere.

Figure 4

The trend is flat for the last 40 years. I don’t know what this means other than I’m sure this is also consistent with the models. The spatial distribution looks like this.

Figure 5

I called this post an engineer’s reconstruction, which is a bit false in that I’m not close to comfortable with the verification. I do like the method because it is so simple, but there is confirmation which needs to be done. Of course the Steig09 team did almost no reasonable verification, but that doesn’t exonerate me. I have run more than 20 variations and am certain that the robustness is excellent for the time periods and certain parameters. However, the level of effect the GISS trend corrections have is unknown. In RegEM the GISS trend corrections have less effect because I now believe the trends in Steig09 are in large part artifacts of the math rather than effects of the data (harsh words).

A lot more work needs to be done on this, but to close these comments: the peninsula stations show an unreasonably high trend, and IMO this must be an error. Something has created an upslope in temps over the long term which cannot reasonably exist next to the adjacent cooling stations on several sides. As we all know I’ve been wrong before, but that’s what I think tonight.

In summary, I’ve used the complete satellite data for spatially weighting the surface stations across the Antarctic. The method eliminates the problem of negative correlation: a negatively correlated surface station instead has zero effect. The drawback of this method is that surface station data which has no values concurrent with the sat data cannot be used, resulting in the elimination of some surface station data.