the Air Vent

Because the world needs another opinion

RegEM weight calculation – a different approach

Posted by Jeff Id on June 26, 2009

This is a guest post by NicL  who has been working on Antarctic station weighting in the background.  His method uses a brute force method for calculating trend sensitivity on the Antarctic reconstruciton.  After discovering that the B matrix in TTLS created a different weighting for each set of years that had different active surface station data, creating a solution which gives weighting becomes more complex.

Nic found less weighting for the peninsula and more for the ross ice shelf which is more in line with my original suspicion.  However, my own impression is that by area the peninsula is still heavily weighted and several stations are negatively weighted.

I haven’t checked the calculations myself but the method is straightforward although time consuming.  I hope you like it.


Jeff Id and Ryan O have done a lot of good work on calculating the effective weights of the different surface stations in Steig’s main satellite AVHRR data based reconstruction.  I would like to present the results of a slightly different approach to this issue.

As Jeff Id has pointed out (Improved Weight Calculation, Air Vent, 15 June 2009), the B matrices used in TTLS RegEM vary across time.  That is because the set of input series having actual values varies with time: the B matrices are identical for different times that have the same set of missing values.  Obviously, whenever an input series has no actual value it has a zero weight in the reconstruction values for that month.  However, I am not sure that fluctuations over time in the weight of individual surface stations is of particular relevance.

To my mind, the key question is what weight the temperature trends in each surface station have in the overall temperature trend of the reconstruction – both Steig’s reported (spliced) trend based on the actual satellite data post 1981, and the trend implicit in the “unspliced solution”, being that using the three principal components (PCs) implicit in the missing values imputed by the reconstruction.  I suspect that time-averaging the weights derived from the B matrices would in particular not give accurate weights for the unspliced solution, since the B matrices only give weights for the missing data points that are imputed.  Moreover, actual data from one station may, through its effect on the covariance matrix, affect the weights given in the B matrices to data from other stations in months where the first station does not have actual reading.  Therefore, I don’t think that the weights in the B matrices can be relied on to give an accurate measure of the effect of the data from a station on the overall reconstruction.

I could see no way of determining the weighting problem analytically, as the covariance matrix used by RegEM to calculate the model solution itself depends on values imputed by RegEM in the previous iteration (hence RegEM’s iterative nature).  I know that Steve McIntyre wrote at Climate Audit, in his 2 June post “TTLS in a Steig Context”, that despite the PCA and truncation the result of TTLS for the pre-1982 period is still a linear recombination of station data: “Because the TTLS relationship can be reduced to a matrix of coefficients B{TTLS}, the ultimate Antarctic temperature index [for the pre 1982 period] proves to be a linear weighting of the underlying station histories”.  However, whilst this statement is true if one takes the B matrices as being fixed, I think that it does not take into account the fact that the B matrices are derived from the covariance matrix for the entire 1957-2006 data set, including the satellite PCs, and so are affected by the post 1981 satellite data.  The position becomes more complicated after RegEM is iteratively applied, as the post 1981 satellite data then also impacts the iterated covariance matrix (which is calculated using imputed as well as actual data) via its effect in imputing missing temperature data.  Accordingly, the pre-1982 RegEM solution is affected by post 1982 data, including the satellite PCs. Hence the weightings to be calculated should in principle include the contribution from the satellite data as well as from the ground stations.

My first attempt at answering the question, of what weight temperature trends at each surface station have in the overall temperature trend of the reconstruction, involved adding a small linear trend to each surface station temperature anomaly series in turn, one at a time, and measuring in each case what was the resulting change in the two measures (spliced and unspliced) of the overall reconstruction temperature trends.  Dividing the change in overall reconstruction trend by the linear trend added to the station data gives the weight of the station in the overall reconstruction trend. I chose to add a trend of 0.2 C /decade, centered on the middle of the 50 year period.  I did not want to add too large a linear trend, which might have a large effect on many of the imputed values and hence on the covariance matrix, in case doing so gave a result that was not a linear extrapolation of a small trend.  An alternative would be to detrend each data series in turn: I turn to that later.  Doing so complicates the calculations slightly, and would not work satisfactorily for stations with very small trends, but for stations with strong temperature trends it should produce a more accurate measure of what effect their actual trends had on the reconstruction in the event that the effect of surface station trends on the overall reconstruction trend is non-linear.

Note that the station weight being measured is affected not only by how temperatures at the station correlate with those elsewhere and the satellite data but also by how long the station temperature record is and how much data is missing.  A strong temperature trend at a station like Asuka with a very short record would have a very small effect on the overall temperature trend of the reconstruction even if the station temperature had average correlations with the other data, so such a station would be expected to have a very low weight.

I was aware that, in order to produce reasonably accurate results, my approach would require running RegEM over 40 times with a small tolerance (I chose tol = 0.001) and hence a large number of iterations.  That would take a long time.  In order to speed things up a bit, I modified Steve McIntyre’s R-port of Tapio Schneider’s RegEM PTTLS so that it only called the pttls subroutine when the set of stations with missing data was different from that for the previous month (or the month before that, to allow for the possibly common occurrence of stations missing data for isolated months).  Having brushed up on my (pretty poor) linear algebra  and learnt to some degree how exactly the regem_pttls function worked, I realised that I could also modify the function so as to directly generate reconstruction PC time series and the (unchanging) set of weights that are applied to the PCs in order to give the imputed values for each series (the unspliced solution).  Doing so is in my view preferable to back solving the PCs and weights from the imputed values in the RegEM output.

The key to calculating the PCs is to observe that RegEM, having each iteration calculated the singular value decomposition of the covariance matrix of the data (actual and imputed, scaled to unit standard deviation), in each month breaks the matrix V of right singular vectors into sub-matrices containing the rows for series respectively with and without actual data for that month (V11 and V21), also discarding all but the first three (=regpar) columns in each case.  Although the vectors included in matrices V11 and V21 vary between months, the vectors themselves are the same for all months.

The pttls solution matrix B for each month is then, in R terminology, given by:

Xr = V11 %*% solve((t(V11) %*% V11) %*% t(V21)

and the imputed values for that months missing data, before rescaling, are given by Xmis = X(actuals) %*% Xr.

This can be broken down into the form of (monthly PCs) %*% (unchanging weights) as:

Xmis = X(actuals for that month) %*% V11 %*% solve((t(V11) %*% V11) [being a 600 x 3 orthogonal matrix of PCs]  %*%  t(V21) [a 3 x k matrix of weights for the k missing values for the month concerned]

and by extension (as V21 is an extract of svd(correlation matrix of X)$v), and after rescaling:

Xsolution = PCs %*% wts

where wts = t(svd(correlation matrix of X)$v[,regpar]), rescaled [a 3 x 45 matrix of weights, being the weights applied to the PCs of the solution to get imputed values for all the data series].  Note that these solution PCs are different from the reconstructed pre-1982 satellite PCs (which are each composed of a linear combination of the solution PCs).   Using lm on the solution PCs confirms that they are orthogonal to each other.

I ran my modified RegEM function, with regpar=3 and maxiter=100, first on the surface station and satellite PC data as per Steig’s reconstruction, and then on data with one series at a time having a linear trend added to it.  The satellite data is represented by three PCs but most all of the average temperature trend is reflected through PC1.  Adding a linear trend to each PC series separately ought to get the overall weighting of the satellite temperature trends in the reconstruction trends correct, but would overstate the effective weighting of satellite data.

I therefore instead added the linear trend to the satellite data itself and took the resulting changes in the two reconstruction temperature trends (spliced and unspliced) as the measure of the total satellite weighting.  In the table below, the satellite average temperature trend as conveyed by the first three PCs is allocated between them according to their contributions thereto, but the overall satellite weighting is not allocated between the three PCs (and is shown just on the PC1 row).  The satellite average temperature trend of 0.261 C/decade is not fully contained in the first three PCs, which between them account for a trend of 0.245.

Running these computations took about two hours on my (rather slow) machine.  The modified version of the TTLS RegEM function and the script used to generate the results are given at the end of this post.   My script assumes that one has already run Ryan O’s very impressive and comprehensive reconstruction and analysis script, from which it borrows some elements.

The results of my computations are shown in Table 1 below: I am afraid that I have not yet mastered any but the simplest graphs in R .   All trends are for the 1957-2006 continent wide average, and are given in degrees C per decade.  The weights (or sensitivities) are the absolute changes in the relevant reconstruction trend divided by the linear trends added to the station data, and have been multiplied by 100 in order to make them easier to read. They are not percentages that are scaled to add up to 100.  In an ideal linear relationships world, and with no interaction effects between data series, one might expect the weights to sum to unity and that summing the products of weights and trend across all surface stations and the satellite PCs would produce the overall reconstruction trend.  In practice, minor inaccuracies in the iterated RegEM output, and possible non-linearities and/or interaction effects between the station data may be expected to result in the weights not summing to unity, and in their products with the station trends not summing to the overall reconstruction trend.

Table 1

Table 1

Table 1
Various items in Table 1 are worth noting:

1. The station weights are significantly different from those arrived at by methods that only take direct effects into account.  The peninsula (including island) stations have under a third of the weighting of East Antarctica, but due to the far higher temperature trend in the peninsula the stations there contribute nearly twice as much to the overall trend as do the stations in East Antarctica.  However, their contribution to the trend – approximately 32% for the Steigian, spliced reconstruction and 40% for the unspliced solution – is still well under 50%.

2. The contribution from the two West Antarctica stations is under 1%.  Incidentally, Byrd appears to be marked as a Ross Ice shelf station in Ryan O’s reconstruction script; I have corrected this here.

3. The three Ross Ice Shelf stations have a surprisingly high aggregate weighting, of about 16% of the total.  This is almost entirely due to Scott_Base and McMurdo, and as these stations have reasonably high temperature trends the region contributes over 20% of the overall reconstruction trend.  Both the Scott_Base and McMurdo data series, even if detrended, have correlations with the satellite PCs that, although modest, are statistically very significant.  However, as the Ross ice shelf is a fairly small area (and these two stations are close together) it seems hard to justify it having such a large weighting in the overall reconstruction trend.

4. The satellite data itself makes a major contribution to the overall trend, but its contribution to the unspliced solution (some 12%) is much less than to Steig’s spliced reconstruction (some 29%).  The weight attributable to the satellite data is only about half its contribution to the overall trend, reflecting the fact that the average temperature trend implicit in the three satellite PCs is about double that of the reconstruction.  It is interesting that the satellite data contribution to the overall trend is much less than 50%, given that the second 25 years of Steig’s reconstruction simply consists of the satellite data itself,.

5. Stations with data in one period can affect the reconstruction trend in a different period, confirming what I indicated earlier.  For example, Ferraz and Great Wall only have post 1981 data, yet adding a trend to their data affects (albeit to a minor extent) the spliced reconstruction, necessarily in the pre 1982 period as all the post 1981 values of the spliced reconstruction are fixed.

6.  The only stations with a non-negligible negative weight are the three western island stations: Grytviken, Orcadas and Signy.  I know negative weights worry people, but there do appear to be negative correlations between temperatures changes on these islands and temperature changes on the Antarctic continent implied by the satellite PCs.  Accordingly, negative weights are the logical outcome of applying a procedure that gives more weight to short term fluctuations than to long term trends.

7.  The sum of the surface station and satellite data weights is fairly close to unity – about 3% higher.

8. The sum over surface stations (and satellite PCs) of contributions to the overall reconstruction trend, being for each data series its weight multiplied by its temperature trend, is significantly greater than the actual overall reconstruction trend:  by 0.018 (0.137 vs 0.119) for the spliced reconstruction, and by 0.022 (0.124 vs 0.102) for the unspliced solution.  Some or all of this excess could be due to the effective weightings varying with time depending on which other stations have missing data, and to the changes in station temperatures not being linear.   However, it could also be due to there being non-linearities in the response of the RegEM output trend to trends in the input series and/or to RegEM introducing inter-station interaction effects.

Further investigations suggested that the response of the RegEM reconstruction average trend to temperature trends in the input data was indeed non-linear.   Therefore, I repeated the above work but with each of the input series being detrended in turn, rather than having  a fixed trend added to it.  Where the trend in a series was under 0.1 C/decade in absolute value, so that the detrended result could have been inaccurate due to residual errors in the RegEM outputs, I instead added a trend of 0.1.  As the satellite data impacts the RegEM results only through the three PCs, rather than detrending each of the satellite grid cell temperature series I instead detrended each of the satellite PCs.  For presentation purposes, I show the satellite PCs weight entirely against PC1 but spread the resulting impact on the RegEM output trends between the three PCs according to the proportion of the average grid cell temperature trend that each conveys.

I also changed the surface station data set from the one used in the above analysis (which I had employed Ryan O’s reconstructions script to download from the BAS website a few weeks ago) to the one archived at Climate Audit, which is much closer to the data used by Steig.  Doing so reduced the base case spliced trend from 0.1193 to 0.1172 C/decade, close to Steig’s result of 0.1178, and the unspliced trend from 0.1023 to 0.1007.

The results of the revised approach are shown in Table 2, below.

Table 2

Table 2

The changes between the Table 1 and Table 2 results are due almost entirely to the use of the detrending method rather than to using an earlier data set, apart from the contribution to the trend from a few of the stations in East Antarctica that showed slightly higher trends in the earlier data set.

Using the detrending method, instead of adding a 0.2 C/ decade linear trend, results in a smaller excess of the sum of the calculated contributions to the overall trend over each base case trend, of 0.006 (0.123 vs 0.117) for the spliced trend and of 0.010 (0.110 vs 0.100) for the unspliced solution trend.  The weights now sum to significantly less than unity, particularly for the unspliced solution.

Many of the surface station weightings as measured by the detrending method are significantly different from those as measured by adding a fixed 0.2 C/decade trend.  This implies that the response of the RegEM overall reconstruction trend to changes in the trend of individual data series can be significantly non-linear.

The total weighting of  the peninsula and island stations in Table 2 is halved as compared to the Table 1 results.  The reduction in their contribution to the overall trend is proportionately much less, and their contribution remains substantial at 25% to the spliced trend and 32% to the unspliced solution trend.  It is unclear why the weightings of the peninsula and, especially, the island stations are affected so much by whether they are determined by detrending or by adding a fixed trend.

Table 2 shows the effect of detrending the satellite data to be a reduction in the spliced overall trend of  0.0385 C/ decade, which is in line with Steig’s result for the reduction in trend when using detrended satellite data.

Finally, I checked the effect of detrending all 45 data series.  With none of its input series having any trend, I expected RegEM’s output to have a zero trend.  However, that does not appear to be the case.  The resulting residual trend (using the surface station data archived at Climate Audit) was +0.0055 C/decade for the spliced output and +0.0042 for the unspliced solution.  Small, but much too large to be due to RegEM not having fully converged: increasing the number of iterations had a negligible effect on the residual trend.  I am not certain what this result is caused by – possibly it is due to some of the data series which have short term correlations with the satellite data having a rising trend over 1957-1981 and a falling trend over 1982-2006.

In conclusion, using my methods the main findings about the average Antarctica trend over 1957-2006 produced by RegEM for the main surface station plus satellite AVHRR data reconstruction are as follows:

 The influence of temperature trends experienced by the peninsula and island stations on the reconstruction spliced overall trend, whilst substantial, is significantly less than generally thought.

 The influence of the Ross ice shelf stations is surprisingly large; it is about 10% less than that of the peninsula and island stations but about 10% more than that of East Antarctica.  The influence of West Antarctica is negligible at under 1%.

 Based on the detrending method, the influence of the Satellite data on the overall reconstruction spliced trend is larger than the total influence of the surface stations in any one region of Antarctica.

 The response of the overall reconstruction trend to trends in individual input data series is in many cases non-linear, in some cases highly so.

 The reconstruction trend remains significantly positive, although small, even when all the input data series are completely detrended.

Nic L
June 22, 2009


Modified version of  TTLS RegEM function

##Nic L’s modified version of Steve McIntyre’s port of Schneider PTTLS RegEM to R
## Modifications are to speed up execution by reducing the number of calls to the pttls routine and to output ## the unspliced solution in the form of Principal Components (Xpcs) & weights to be applied to them (wts)

pttls.n= function ( V, d, colA, colB, r ) { #d here is eigenvalue NOT squared eigenvalue (Schneider)
(na=nrow(V)) #23
(ma  = ncol(V)) #23        #  ma must exceed nd
(nd   = length(d)) ; #23
n       = length(colA);           # 23 number of columns of A (number of variables)
k       = length(colB);           # 82 number of right-hand sides
nr = length(r); #1: nr appears to be only 1 in relevant cases

# initialize output variables
Xr = array(0, dim=c(n,k, nr));dim(Xr) # 23 82 1  #assume k = 1
Sr = array(0,dim=c(k, k,nr));dim(Sr)  # 82 82 1  #assume k =1  3D in Schneider
rho = array(0,dim=c(nr,1)); dim(rho) #  1 1 if (nargout > 2)
eta = array(0,dim=c(nr,1)); dim(eta)# 1 1 if (nargout==4)

# compute a separate solution for each r # extra since only one r
for (ir in 1:nr)  { #nr
rc          = r[ir]; #3
V11         = V[colA, 1:rc];# 23 3
if(!(length(colB)==1)) V21 = as.matrix(V[colB, 1:rc]) else V21 = t(V[colB, 1:rc]);  # 82 3
sol = V11 %*% solve( t(V11) %*% V11)
Xr[,,ir]  =  sol %*% t(V21)  # dim(Xr[,,ir]) #23 82

#  estimated covariance matrix of residuals dB0’*dB0 (up to a scaling factor)
if(!(length(colB)==1)) V22 = as.matrix(V[colB, (rc+1):na]) else V22 =  t(V[colB, (rc+1):na]) ;  # 82 3
#V22         = V[colB, (rc+1):nd]   #102 82
Sr[,,ir]  = V22 %*% diag( d[(rc+1):nd]^2) %*% t(V22)  # 1,]    0    0 4.146356    0    0

# alternative formula when all left singular vectors are given (not used)
rho[ir]   = sqrt(sum(d[(rc+1):nd]^2)); #  8.4563     # residual norm = norm( [dA0 dB0], ‘fro’)
eta[ir]   = sqrt( ssq(Xr[,,ir]) ) #  0.5372863
#implements eta[ir]   = norm(Xr[,ir], ‘fro’); # solution norm = norm( Xr, ‘fro’)
}  # ir loop

}  #function

regem_pttls.n=function(X,maxiter=50,tol=.005,regpar=3,method=”default”,startmethod=”zero”) {

## initial infill

for( j in 1:n) { kavlr[[j]]= (1:p)[!indmis[j,]];
kmisr[[j]]= (1:p)[indmis[j,]]}
#this sets up missing values on line by line basis

(dofC=n-1) #599
(trunc=min( regpar,n-1,p)) #3
(dofS=dofC-trunc) #596

if(startmethod==”zero”) {
X=scale(X,scale=FALSE)  #center first
M= attributes(X)$”scaled:center”; #M[1:5]
X[indmis]=0 #zero  page 6
} #startmethod

if(startmethod==”average”) {
X=scale(X,scale=FALSE) #center first
M= attributes(X)$”scaled:center”; #M[1:5]
X[indmis]= array( rep(m0,p),dim=c(n,p)) [indmis]
} #startmethod


while (iter <=maxiter &  dXmis>tol) {
C=t(X)%*%X/dofC   #this is the same as cov(X)
D=sqrt(diag(C)) ;# also sd
X=X%*%diag(1/D) #standardize to sd=1
C=diag(1/D)%*%C%*% diag(1/D)


Xmis=array(0,dim=c(n,p)) #placeholders

for(j in 1:n) {
if ((j>1)&&identical(kavlr[[j]],kavlr[[j-1]])) {B[[j]]=B[[j-1]]; S[[j]]=S[[j-1]]; sol[[j]]=sol[[j-1]]} else {
if((j>2)&&identical(kavlr[[j]],kavlr[[j-2]])) {B[[j]]=B[[j-2]]; S[[j]]=S[[j-2]]; sol[[j]]=sol[[j-2]]} else {test=pttls.n(svd.ant$v,svd.ant$d[1:neigs],colA=kavlr[[j]],colB=kmisr[[j]],r=regpar);
B[[j]]=test$Xr[,,1]; S[[j]]=test$Sr[,,1]; sol[[j]]=test$sol}
row.names(B[[j]])=kavlr[[j]]; dimnames(B[[j]])[[2]]=kmisr[[j]]
Xerr[j,kmisr[[j]] ]=dofC/dofS  *  t (sqrt(diag(S[[j]])))
Xmis[j,kmisr[[j]] ]= X[j,kavlr[[j]]] %*%B[[j]]
Xpcs[j,]= X[j,kavlr[[j]]] %*%sol[[j]]
CovRes[ kmisr[[j]] , kmisr[[j]] ]= CovRes[ kmisr[[j]] , kmisr[[j]] ]+S[[j]]

##rescale to original scaling
# X first step

# Xmis

# Xerr

# C, CovRes
C=diag(D)%*%C%*% diag(D)    #rescale to covariance
CovRes=diag(D)%*% CovRes %*% diag(D) #rescale

# dXmis
dXmis=sqrt( ssq( Xmis[indmis]-X[indmis]))/sqrt(nmis)

# X second part
X[indmis]=Xmis[indmis] #update data
X=scale(X,scale=FALSE) # #recenter
M=M+Mup ; #round(M,3)[1:7]#updated mean vector

# C second part
C=t(X)%*%X/dofC   #cov(X)
D=sqrt(diag(C)); wts=t(svd(diag(1/D)%*%C%*%diag(1/D))$v[,1:regpar]*D)

#summarize iteration comparisons
if(method==”verbose”) regem[[iter]]=list(X=X,C=C,B=B,S=S,dXmis=dXmis) else regem[[iter]]=list(X=X,dXmis=dXmis)
} #iter
regem[[N]]$B=B; regem[[N]]$Xmis=Xmis; regem[[N]]$M=M; regem[[N]]$C= C; regem[[N]]$Xpcs=Xpcs; regem[[N]]$wts=wts
} #function

Script to generate the data in Tables 1 & 2 and referred to elsewhere in the text

#load and prepare data using  functions and data objects from Ryan O’s reconstruction
dat=window(calc.anom(all.avhrr), start=c(1982), end=c(2006, 12))
base=window(, form=clip), start=1957, end=c(2006, 12))
pcs.sat=ts(pcasat$x[,1:3],start=1982, frequency=12)

statmask[17]=FALSE #rm gough
statmask[26]=FALSE #rm marion
stationIDX[8,10]=2 # correct Byrd to being West Antarctica


#compute the base reconstruction and derive the spliced and unspliced average trends
tr2=lm(s3n[[it]]$Xpcs %*% s3n[[it]]$wts[,43:45] %*% av.wts*10 ~ time(dd))$coefficients[2]

# create matrix of data series with fixed 0.2 C/decade trend added (or of equivalent size in terms of the underlying averaged data for satellite PCs)
tr=seq(length=600, by=1/600, from=-299.5/600)

#compute the effects of adding the trend to each surface station, one by one
for (j in 1:42) {
wt[1,j]=(lm(s3n[[it]]$X[,43:45] %*% av.wts*10 ~ time(dd))$coefficients[2]-tr1)/0.2 #change in spliced trend
wt[2,j]=(lm(s3n[[it]]$Xpcs %*% s3n[[it]]$wts[,43:45] %*% av.wts*10 ~ time(dd))$coefficients[2]-tr2)/0.2  #change in unspliced trend

#compute the effects of adding the trend to the satellite data
pcs.sat2=ts(pcasat2$x[,1:3],start=1982, frequency=12)
wt[1,43]=(lm(s3n[[it]]$X[,43:45] %*% av.wts2*10 ~ time(dd))$coefficients[2]-tr1)/0.2  #change in spliced trend
wt[2,43]=(lm(s3n[[it]]$Xpcs %*% s3n[[it]]$wts[,43:45] %*% av.wts2*10 ~ time(dd))$coefficients[2]-tr2)/0.2  #change in unspliced trend

#compute surface station and satellite PC temperature trends
for (j in 1:45) { wt[3,j]=lm(dd[,j]*10 ~ time(dd))$coefficients[2] }

wt[5,]=stationtype; #wt[4,]=c(substring(colnames(base),10),”Sat PC1″,”Sat PC2″,”Sat PC3″)

#Repeat the analysis but using surface station data from Climate Audit and Steve M’s anom function
#and also detrending the temperature trend for each data series (or adding 0.1 C/decade trend if absolute actual trend is <0.1, to avoid rounding errors)
#load and prepare data
#download.file(“;,””,mode=”wb”); #uncomment if not already downloaded
surf=window(surf,start=1957,end=c(2006,12)) #ends in 2006 per Steig
surf = surf[,-26] #deletes column 26 – Marion
surf = surf[,-17] #deletes column 17 – Gough
for (i in 1:42) ss_ca.o[,i]=anom(surf[,i],reference=1957:2006)

#compute the base reconstruction and derive the spliced and unspliced average trends
tr2c=lm(s3n[[it]]$Xpcs %*% s3n[[it]]$wts[,43:45] %*% av.wts*10 ~ time(ddc))$coefficients[2]

#recompute surface station, and copy satellite PC, temperature trends and add station identity info
for (j in 1:45) { wtc[3,j]=lm(ddc[,j]*10 ~ time(ddc))$coefficients[2] }

wtc[5,]=stationtype; #wtc[4,]=c(substring(colnames(base),10),”Sat PC1″,”Sat PC2″,”Sat PC3″)

#compute the effects of detrending (or adding a 0.1 C/decade trend, if greater) to each surface station, one by one
for (j in 1:42) {
if (abs(wtc[3,j])<0.1) {xin[,j]=ddc[,j]+seq(length=600, from=-149.75/600, to=149.75/600);delta=0.1} else {xin[,j]=ddc[,j]-wtc[3,j]/120*seq(length=600, from=-299.5, to=299.5);delta=-wtc[3,j]}
wtc[1,j]=(lm(s3n[[it]]$X[,43:45] %*% av.wts*10 ~ time(dd))$coefficients[2]-tr1c)/delta #change in spliced trend
wtc[2,j]=(lm(s3n[[it]]$Xpcs %*% s3n[[it]]$wts[,43:45] %*% av.wts*10 ~ time(dd))$coefficients[2]-tr2c)/delta  #change in unspliced trend

#compute the effects of detrending the satellite data (at PC level since it is the PCs that interact with the other data series)
tr.pc[1]=lm(pcs.sat[,1]*av.wts[1]*120 ~ seq(1:300))$coefficients[2]
tr.pc[2]=lm(pcs.sat[,2]*av.wts[2]*120 ~ seq(1:300))$coefficients[2]
tr.pc[3]=lm(pcs.sat[,3]*av.wts[3]*120 ~ seq(1:300))$coefficients[2]
xin[301:600,43]=ddc[301:600,43] – tr.pc[1]/av.wts[1]/120*seq(length=300, from=-149.5, to=149.5)
xin[301:600,44]=ddc[301:600,44] – tr.pc[2]/av.wts[2]/120*seq(length=300, from=-149.5, to=149.5)
xin[301:600,45]=ddc[301:600,45] – tr.pc[3]/av.wts[3]/120*seq(length=300, from=-149.5, to=149.5)
wtc[1,43]=(lm(s3n[[it]]$X[,43:45] %*% av.wts*10 ~ time(dd))$coefficients[2]-tr1c)/-sum(wtc[3,43:45])
wtc[2,43]=(lm(s3n[[it]]$Xpcs %*% s3n[[it]]$wts[,43:45] %*% av.wts*10 ~ time(dd))$coefficients[2]-tr2c)/-sum(wtc[3,43:45])

# check if detrending all data series produces zero RegEM output trends
for (j in 1:42){xin[,j]=ddc[,j]-wtc[3,j]/120*seq(length=600, from=-299.5, to=299.5) }
lm(s3n[[it]]$X[,43:45] %*% av.wts*10 ~ time(dd))$coefficients[2]  # 0.005455
lm(s3n[[it]]$Xpcs %*% s3n[[it]]$wts[,43:45] %*% av.wts*10 ~ time(dd))$coefficients[2]  #  0.004190027

49 Responses to “RegEM weight calculation – a different approach”

  1. Kenneth Fritsch said

    NicL, thanks for all your efforts in putting together and posting your analysis. If I may, I will give my layman’s view of what you have concluded here by your methods and alternative methods. You derived a weighting of the 4 areas of Antarctica and the Satellite AVHRR PCs in the Steig reconstruction. The weighting is in effect a factor which when taken times the area (satellite) trend of that area (satellite) yields the temperature anomaly trend for the Antarctica as a whole.

    You make a point of noting that these portions do not weight the Peninsula as heavily as some previous analyses did. I assume from my understanding of your analysis that the weightings are independent of area and PC anomaly temperature trends. I guess the next logical step would be to compare the weightings that you derived for the areas (by leaving out the satellite PCs) and comparing those weights to the weights calculated by geographical area included in each area and then explaining why those weightings that you derived from Steig and calculated by area should be different.

  2. Kenneth Fritsch said

    More correctly from above should have been:

    The weighting is in effect a factor which when taken times the area (satellite) trend of that area (satellite) yields the temperature anomaly trend for the Antarctica as a whole -when all the area (satellite) products are added together.

  3. Nic L said


    1 & 2: Thanks for your comments.

    Your definition is I think what the weightings would represent if RegEM worked in a straightforward way. As I noted, in practice adding all the products of weighting and trend gives a slightly higher result than temperature anomaly trend for Antarctica as a whole. That is so even when non-linearities are allowed for by estimating weighting by completely detrending each data series (as per Table 2), rather than by adjusting its trend by a fixed amount (as per Table 1). Per Table 2, the sum of products and actual trends (C/decade ) are 0.1230 vs 0.1172 (excess 0.0058) for the spliced (Steigian) result and 0.1084 vs 0.1007 (excess 0.0077) for the unspliced solution (these latter figures correct small errors in those given in the post).

    The non-zero anomaly trend for Antarctica as a whole that RegEM gives even when all the surface station and satellite data is detrended (0.0055 for the spliced result and 0.0042 for the unspliced solution) appears to account for much of the discrepancy. The remainder is probably due to the trends in the data from different areas interacting within the RegEM algorithm. There do not appear to be any significant interaction effects of trends in different surface stations in the same area. For all areas, I found that simultaneously detrending all surface station data in the area produced a change in anomaly trend for the Antarctica as a whole that matched pretty closely the sum of the weighting x trend products for individual surface stations in that area.

    The area (and satellite) weightings are in principle independent of the anomaly temperature trends. However, in practice I believe that the RegEM result is affected by temperature trends, which make are one of the factors that determine the covariance matrix used by RegEM. I suspect that the high temperature trends exhibited by two of the three Ross ice shelf stations may be a reason for the high weighting of that area. The same may well be true of the Peninsula, although the large number of stations in that area is almost certainly a factor as well. Other things being equal, the more stations an area has the higher it will be weighted by RegEM. West Antartica has a low weighting despite a high average temperature trend, but the two stations there only have short records and so have relatively little influence on the RegEM result.

    I think it is fairly clear that the Ross ice shelf area is overweighted in Steig’s result relative to its geographical area, maybe for the reason given above. West Antartica is underweighted in Steig’s result relative to its geographical area (although not as much as appears from the 0.15% weight in Table 2, as the temperatures in each area also affect the satellite data, which has a 15.7% weighting). I think this is due to the lack of surface station data there. The peninsula and islands are probably overweighted relative to their geographical area (which I don’t have a figure for), but that may depend on how one treats the islands.

  4. I’m obviously aware of the iterative nature of RegEM and made specific note of it. My point was different. I was drawing attention to the “near” block matrix structure of the situation and pointing towards an interpretation of the iterations as perturbations of the block matrix structure. My own sense is that not much is accomplished by the perturbation and that much of the RegEM process is sound and fury signifying nothing. The key things are latent in the block matrix situation.

    In addition, the statistical meaning of revising the B-matrix for perturbation information is not a gimme. I’ve recommended Brown and Sundberg 1989 on this question; I believe that the conditions here have much in common with the discussion in Brown and Sundberg 1989 as to “updating” the B-matrix. However, Brown and Sundberg 1989’s heavy going.

  5. Kenneth Fritsch said

    I think it is fairly clear that the Ross ice shelf area is overweighted in Steig’s result relative to its geographical area, maybe for the reason given above. West Antartica is underweighted in Steig’s result relative to its geographical area (although not as much as appears from the 0.15% weight in Table 2, as the temperatures in each area also affect the satellite data, which has a 15.7% weighting). I think this is due to the lack of surface station data there. The peninsula and islands are probably overweighted relative to their geographical area (which I don’t have a figure for), but that may depend on how one treats the islands.

    Nic L, when I did an area calculation a while back I divided the Antarctica into 3 areas starting by East separated from West at 180/0 longitude and then the Peninsula from the West by the area between 285 to 330 longitude and latitude greater than -76. That gave the area percentages as: East Antarctica
    = 65%, West Anarctica = 31% and the Peninsula = 4%. I would think that the satellite data should be apportioned by area to the separate Antarctica regions.

    I am seriously beginning to think that using a reconstruction method beyond the simpliest and easily understood method should require the user to explain in detail why that method was required and what are its advantages – other than to be able to say it is better than the back of envelope calculations of past times.

    Also does your analysis allow any insights into the authors of Steig admitting the obvious problem that the reconstruction gives cooler temperature anomaly trends than have been measured in the Peninsula.

  6. Nic L said


    4. Noted, thank you. My point about your statement was really that one of the reasons that the weights given by the B matrices might not be an accurate guide to the influence of each surface station on the overall reconstruction was that each B matrix was affected, in a difficult to analyse way, by all the data input to RegEM (including, in the pre 1982 period, by the post 1981 satellite data).

    I think that I broadly understand what you are saying about the block matrix structure, and that RegEM’s iterations can be interpreted as perturbing the block matrix.

    Am I right in thinking that if much of the RegEM process is sound and fury, then there should be relatively little change in the B matrices between the first and last iterations? From a brief examination of the B matrices (for times in early 1957 and late 2006), there are actually substantial changes in those matrices as RegEM iteratively progresses.

    I will try to study Brown and Sundberg 1989 in more detail as you recommend – it certainly is heavy going.

  7. Nic L said


    “I divided the Antarctica into 3 areas starting by East separated from West at 180/0 longitude and then the Peninsula from the West by the area between 285 to 330 longitude and latitude greater than -76. That gave the area percentages as: East Antarctica = 65%, West Anarctica = 31% and the Peninsula = 4%.”

    I don’t think that longitude 180/0, whilst logical, is the usual method of separating East from West Antarctica. Steig uses dividing longitudes of 180/60W. That gives a higher area weighting for East, and a lower area weighting for West, Antarctica. I am not sure how one should measure the Ross ice shelf and apportion it between East and West Antartica: all three surface stations that are marked as Ross ice shelf are east of 180 longitude but the ice shelf straddles 180 longitude.

    I agree with your 4% figure for the peninsula based on your –76 latitude cut off and the number of satellite AVHRR data cells, but Steig defines it using –72 as the cut off (and also includes all westerly latitudes, which appears to mean that his definition includes Neumayer, which is nowhere near the peninsula). This all leaves out the islands, which I think most people agree should not form part of any reconstruction of temperatures on the Antarctic continent.

    “I would think that the satellite data should be apportioned by area to the separate Antarctica regions.”

    I agree.

    I think that Steig’s choice of just three principal components to represent the satellite data (and hence the reconstruction for the whole period and each sub-period) means that temperature trends in the peninsula cannot be accurately reflected in the reconstruction.

  8. Nic L said

    7. I obviously haven’t quite got the hang of WordPress blockquote cites yet! 🙂

  9. Mark T said

    It behaves oddly if you have more opens than closes, and stray gt or lt symbols screw things up, too. It’s not really WordPress so much as it is xhtml.


  10. Nic L said

    Further re 7. On Steig’s definition of the penisnsula, but excluding points with longitudes 345-360 that are nowhere near the actual peninsula, it makes up 2.6% of the satellite AVHRR data cells. On that basis, the 7.9% weighting that I calculated the peninsula is given in the (spliced) RegEM reconstruction is three times its area weighting. A back of the envelope calculation indicates that the excess weighting of the peninsula pushes up the reconstruction temperature trend by approximately 0.015 C/decade, which is a significant fraction of the trend. That is ignoring the fact that the pensinsula also affects the reconstruction temperature trend as it contributes positively to the temperature trend in the satellite data, which has a 15.7% weighting.

  11. Jeff Id said


    I’m not sure why you used a detrending one at a time, but when you do it that way you are adding a different signal into each station.

    I much prefer the first table which you added 0.2 but why not add 0.002 or something small? This would perturb the matrix less.

  12. Jeff Id said


    I don’t believe the 7% result is very close. I’ve been thinking and the back calculation shows a 25% contribution when the station ID’s are correct. The B matrix shows 50%, I understand the concerns but the detrending of individual stations doesn’t make sense to me. Maybe I’m missing something. Also the increasing of trend by 0.2 is pretty severe, perhaps enough to cause a deweighting because of it’s extreme status.

    I like the approach but the detail has me wondering how close this estimate is.

  13. Nic L said


    11 & 12. I’m not sure that I understand what aspect of the procedure of detrending individual stations you take issue with. Detrending an individual station does indeed add a different signal into all the other stations. It thereby captures the indirect effects of that station’s signal, which operate through that station’s data perturbing the B matrices for every month, thereby altering the imputed data for other stations even in months when that station has no actual data.

    I can see that detrending on a station by station basis would give an inaccurate measure of the combined weight of the stations in an area, or in Antarctica as a whole, if interaction effects between stations were significant. It is In order to address the interaction issue for areas, I also tried detrending stations area by area, which gave very similar results to detrending station by station and summing over stations in each area, although there was a modest difference when the peninsula as a whole was detrended – the effect of which on the reconstruction average trend was about 0.002 less than the sum of the individual station effects (0.0285 vs 0.0308 for the spliced reconstruction).

    I agree with your view that the 0.2 trend that I added in Table 1 was rather large – I was worried that the RegEM convergence tolerance might lead to inaccurate results if a very small change in trend was used. I found that this was a problem adding a trend of 0.002, but I am now re-running the script adding a fixed trend of 0.02 to each station (using the station data archived at Climate Audit) and reducing the RegEM convergence tolerance to avoid inaccuracy. I will report the results when obtained. They should provide a good measure of the marginal sensitivity, allowing for indirect effects, of the average reconstruction trend to changes in trend in individual stations from their actual level. If the response of the average reconstruction trend to trends in individual stations were linear, it would also provide a good measure of the contribution of each station’s trend to the average reconstruction trend. Unfortunately, that response seems to be non-linear, which is why it is necessary to completely detrend a station’s data in order to measure the full contribution of that station’s data to the average reconstruction trend.

    Although my estimate of the peninsula and island weighting using the detrending method is fairly low at circa 8.5% of the total weightings of all data series, it is still well over 3x the area weighting of the peninsula. Further, because of the very high temperature trend of the peninsula its contribution to pushing up the reconstruction average temperature trend is substantial.

  14. Jeff Id said

    Nic, What I think happens is that each station is detrended by a different amount. Some stations have high trend, some have none. This would lead to a bias in the weight contribution percentage.

    I understand your point about RegEM convergence noise but the matrix is almost exactly the same so I wonder if it doesn’t still work with very small changes.

    On a third point, consider that in the peninsula you deweight or add weight to a single station, other stations are also very similar in covariance. If you change one by 0.2 it might get naturally deweighted because of its dissimilar trend and others would take over giving a less than true weight in the reconstruction.

    If I’m understanding it, the first table is more correct and a decreased value of added trend should either increase the contribution of the peninsula or create very little change. It will be interesting to see a smaller number used.

  15. Nic L said


    Yes, each station is detrended by the amount of its own trend, but in measuring a station’s weight I divided the reduction in the reconstruction average temperature trend resulting from detrending that station (the figure in the final two columns of my tables) by the trend of that station. I don’t see how this gives rise to a bias?

    I tried adding a trend of 0.01, without reducing the RegEM convergence tolerance, and the results didn’t look sensible. I am hopeful that adding a 0.02 trend and reducing the tolerance to 0.0002 will give resonably accurate results – the script is still running.

    I agree that the effect of detrending, or adding a substantial trend to, a single station which is one of a group of stations that are highly correlated may not give an accurate result for its contribution to the average reconstruction trend, because interactions arise between the effects of stations within RegEM. The direction of such interaction effects is not obvious to me. Interaction effects presumably account for the sum of the weights I measured by detrending individual peninsula stations (0.0308) being about 8% higher (not lower) than their combined weight measured by detrending all the peninsula stations (0.0285). But the fact that the difference for the peninsula was only 8%, and that it was much less for other areas, gives some comfort that interaction effects are not large enough to make the weight estimates seriously out.

  16. Jeff Id said

    I’ll need to take a minute and read the script because if you detrend a station by .04 and another by 0.07 the weights after the fact need to be adjusted to figure out the station contribution to the reconstruction. To me it seems that the 15% from table one is closer to reality from the matrix back calculation which came up with about 25% after names were corrected. I doubt that back calculation can be very far off so I have doubts that the 8 percent value is representing weight.

    If you detrend a single peninsula station, the others are so similar the the detrended one could be deweighted and others increase to balance the trend and the detrended one would appear to have very low weight/effect. I guess I’m not confident that this method is giving a good result because of that.

    It might be fine but why wouldn’t the very similar neighbor station simply take up the weight almost in the high density peninsula area? I’m not being critical, it’s interesting and well done. I’m just trying to describe areas which may cause problems with this analysis.

  17. Nic L said


    The station contribution to the reconstruction trend (calculated in Excel from the exported weights and trends matrix wtc) is the station weight are multiplied by the station trend.

    I have now completed running my script with trends of 0.02 added and analysed the results. I used the same basis as for Table 2, using the Climate Audit data, but with the fixed 0.02 trend added instead of detrending. This gives results much closer to my Table 2 than to my Table 1, bearing out your view that adding a trend as high as 0.2 distorted the Table 1 results . In fact, the new weight for the peninsula (including island) stations is lower than per Table 2, at 6.9 rather than 7.9. However, the contribution of the peninsula stations to the reconstruction trend is slightly higher than per Table 2, at 0.033 vs 0.031.

    I think that it may be more useful to concentrate on the contribution of stations in an area to the reconstruction trend, rather than the simple sum of the station weightings – which gives equal emphasis to stations with a small trend to those with a large trend (which have a much greater impact on the average reconstruction trend). The average peninsula /island station trend, weighted by the number of observations, is 0.319. With the peninsula comprising only 2.6% of Antarctica by area on Steig’s definition, on a fair method it would therefore only contribute 0.008 to the average reconstruction trend. Arguably a bit less, as on the whole peninsula stations have fewer observations than average. So RegEM’s average reconstruction trend is boosted by approximately 0.024 (average of 0.033 and 0.031, minus 0.008) by the peninsula stations, or by some 25% (from 0.094 to 0.118).

    You ask about detrending stations one by one “why wouldn’t the very similar neighbor station simply take up the weight almost in the high density peninsula area?” I think the answer is that stations acquire their differing weights in RegEM only via their differing covariances with other stations, which alter little when some station is detrended (other than with that station). Detrending a station will alter its own covariances with other stations, by a greater or lesser amount depending on the relation between its trend and its standard deviation. But the detrending will only affect the covariances between other stations (including the satellite PCs) to the extent that their imputed values change, which interaction effect appears in most cases to be fairly minor. Apart from that, there is no mechanism that I can see for other similar stations to take up the weight of the detrended station.

    I have looked at the effect on the final RegEM covariance matrix C of detrending Arturo_Prat, a peninsula station with a high trend (0.365) and a significant weighting. Its covariances with other stations and the satellite PCs were moderately changed, as one might expect. But the covariances of Bellingshausen, a nearby peninsula station with a similar trend and weighting, with all data series other than Arturo_Prat were almost unaffected by detrending Arturo_Prat. The same goes for its effect on the covariances of Faraday and O_Higgins, two other similar stations. The fact that the effect on the average (spliced) reconstruction trend of detrending all the peninsula stations as a group is similar (at 0.030; my previous figure of 0.0285 left out Macquarie) to the sum of the effects of detrending them one by one (of 0.031) supports my view that detrending a station gives a fairly accurate measure of its contribution to the average reconstruction trend. Further support to the accuracy of the detrending approach is given by the very similar effect of simply removing all the peninsula stations from the RegEM input matrix. Doing so reduces the average reconstruction trend by 0.032.

    I haven’t seen the script for your (corrected) matrix back calculation, but so far as I can see the code for your B matrix method sums the weight of the stations for each area for all data series being imputed for each month. Maybe I am missing something, but it seems to me that will give a different measure of weight from the weight of the stations in the average reconstruction trend – which depends only on the reconstructed satellite PC series, not on any of the surface station series. I modified your B matrix code to sum only over the satellite PC series (weighted by the contribution of each to the average reconstruction trend and using signed rather than absolute values) and got a much lower percentage: about 9% for the 1957-1981 period (after which the satellite PCs are not imputed by the B matrices, of course).

  18. Jeff Id said


    Thanks for the long reply. Regarding the back calculation we found that the answer solved fairly close for the single solution (not the B) version. The weightings matched the output from Steig et al closely yet I had the wrong station names. Just correcting the names still left us with a much higher peninsula weight. When you first sent the post, I thought it would do a good job of getting a trend weight but after a few days of thinking about it, I’m not confident that’s what it’s doing.

    Consider a situation with 3 stations and the satellite pc’s. Assume 2 of the 3 are nearly the same data. Delete one of the two, how much change will there be to the reconstruction? I would say not very much but that may not relate to how much trend contribution there was in the full 3 stationrecon.

    Now in your situation you’ve added a trend, like you I would expect the HF information to dominate the signal and the delta trend in the result to follow according to the weights. However, we know from single matrix back calculation that 8 percent is too low (the weighted graph was near perfect) and the single solution is not that far off.

    If you take the weights you’re calculating for each station, can you reconstruct the steig curve reasonably closely? That would settle it.

  19. Nic L said


    On your example, I agree that if you remove one of 3 stations that had nearly identical data then the effect on the reconstruction will be less than one-third of the total, and that that effect is a theoretical weakness in my method. But the fact that the change in trend is almost the same whether peninsula stations are detrended one by one, all at once, or removed entirely, demonstrates that the error caused by this effect is very small.

    My weightings do not enable the monthly movements in Steig’s reconstruction to be emulated accurately, nor would I expect them to, because they only measure the overall effect on the reconstruction trend of each of the data series. That effect is not spread evenly over time for any of the data series. Further, the effect on the trend of a station is reflected not only through a greater or lesser direct weighting for it in the B matrices used in calculating the reconstruction but also through changing the weighting of other data series in the B matrices. So the effect of a station on the HF monthly wiggles of the reconstruction and its LF trend may be different.

    I have now run your back calculation script with the station IDs corrected. If I have not made any mistakes, it gives a total absolute weighting for the peninsula & island stations of 25% (0.272/1.095) , but a net weighting of only 8% (0.072/0.895) – some of the weights being negative. Is that right? The net weighting is quite close to my Table 2 figure, but I think that is just coincidence. Perhaps I am missing something, but I am not convinced that they are an accurate measure of the influence of the various stations on the average reconstruction trend. Although the back calculated weights give a good fit to the reconstruction over 1957-1981, just using lm to fit the reconstruction over 1957-1981 to the 34 surface stations with data gives a better fit, and a net weighting for the peninsula stations of only 4%.

    The back calculation equations to which the weights are the solution include imputed as well as actual surface station temperature anomalies, with the same set of weights applying to both actual and imputed values. But the imputed values are themselves weighted combinations of values from other surface stations that do have actual data for the month in question. Further, the weights in those combinations depend on actual and imputed values of all data series for all months, as they are driven by the overall covariance matrix.

    Quite apart from my doubts as to what the back calculation weights really measure, it seems to me that they should be independent of the choice of index rows. But if, for example, the index rows used are searched for in reverse order, from 300 to 1 (which works if the station are ordered using (i in c(24:29,1:23,30:34)) when searching for their index rows) then I get weights that are significantly different, and the net weighting of the peninsula falls from 8% to 4%. The fit to the reconstruction in 1957-1981 is then, incidentally, even better than your original result. So your back calculation does not seem to give a unique result for the weightings, unless I have done something stupid.

    Also, the back calculation only uses 1957-1981 data and gives zero weights to data series where there is no pre 1982 data (including the satellite PCs). However, when I ran the calculation omitting those stations from RegEM the trend changed from 0.119 to 0.126 and the weights given to the 34 surface stations with pre 1982 data were completely different (e.g. Arturo_Pratt went from 0.0421 to 0.2709), showing that stations with no pre 1982 data do affect the reconstruction in the period 1957-2001.

    I’m sorry, I didn’t mean to appear critical your back calculation method, which is an impressive piece of work, and one which I initially thought would give a valid approximation. I imagine that the method would do so in a case where, as in the example in Steve M’s “TTLS in a Steig Context” algebraic analysis, there is no missing data in the surface station records.

  20. Jeff Id said


    I’ve never looked at the weights as anything other than absolute value so I’ll have to check. It would make sense if the back calculation matched this method, things would be well verified.

    I don’t mind criticism, and rather enjoy thoughtful criticism. Your work is appreciated here more than you know – no consensus required! 🙂

    I believe the back calculation to be fairly accurate because in the reconstruction of Steig data (Figure 4 I think) of the original post it was simply w1 * c1 + w2 * c2 ….. The influence seems cut and dry to me so I don’t understand your thinking on it not being an accurate measure of influence, unless you mean the back calculation solution is unstable or something. Perhaps you can explain a little more.

    Also, I wonder why do you think the 8% match to your calculation is coincidence, were there other areas or weights that didn’t match? If you’re just looking at peninsula stations there may be some blending of weights from your method because of their similarity of information so peninsula stations may get scrambled a little.

  21. Ryan O said


    As you mentioned, your method biases weights toward stations with long station histories. Adding a .02 Deg C/decade trend to Amundsen Scott does not have the same impact as doing it to Asuka, because that trend is not carried through all 600 months. In periods where the station data is missing, the imputed values will be based on the fit between that particular station and the other, non-trended (detrended) stations. That fit dampens the magnitude of your trending/detrending.

    If you look at the solution for a particular station – like Asuka – after the imputation is complete, you will see that the solution trend is markedly different from your added trend. This is because, with short station histories, the HF wiggles are what is matched – not the LF component. You can easily see this by comparing how r is penalized by missing the HF vs. LF as history decreases. This results in the imputation of Asuka to remain largely unchanged in Xmis despite the change to X because Asuka shows good correlation to the remainder of the Peninsula stations. By comparing a detrended Asuka to a detrended Amundsen-Scott, you’re comparing apples and oranges.

    I believe your method would work if you extracted the Asuka solution from Xmis, detrended that, and then plugged that back into RegEM in its entirety.

  22. Ryan O said

    I forgot to mention – the reason for this is that you only update B when the number of stations changes, and Xmis depends on B. This means that stations with long station histories get updated quite often, but ones with short ones don’t. While this may be an okay approximation (you would have to do verification testing to show this), it is likely to be highly biased because the station information is so noisy. I’m not sure it would be valid to base B on so few samples.

    So I suppose I should add the caveat that your method might work assuming that computing a solution for B only when the station complement changes (and, hence, basing Xmis on far fewer points) provides a valid approximation of Xmis.

  23. Nic L said


    Thanks for your comments.

    I can see the merits of using absolute weights for some purposes, but when one is trying to see whether the peninsula as a whole is over-weighted in terms of its influence on the reconstruction trend it seems to me that taking signs into account must makes sense.

    I fear that the back calculation is indeed unstable, as illustrated by the widely differing weights for the peninsula depending on in which order the index months are chosen. Try substituting in your script:

    for(i in c(24:29,1:23,30:34))
    while( ([j,i]) == TRUE) | (sum(index==j)!=0) )

    and see the difference it makes to the total weight for the peninsula stations. I don’t see that would happen if the method were valid.

    I think that the basic problem here is that your equation:

    T output = (C1 * SST1) + (C2 * SST2) ……. (Cn * SSTn)

    can only sensibly be regarded as measuring the influence of each surface station over the period concerned when solved using only actual data. But your method solves it using imputed as well as actual data.

    If one were, contrary to my view, to accept as valid the use of imputed data in solving the equations, why not use lm on the actual + imputed 34 surface stations data over the whole 1957-1981 period, which gives a much better fit (and a much lower weight for the peninsula), rather than just over the index months (which is what your method in effect does)?

    Also, a lesser issue but as you know the weights differ according to which stations are missing data for the month concerned.

    I think that your B matrix method (summing only over the satellite PC series, weighted by the contribution of each to the average reconstruction trend) avoids the first problem. It also helps solve the second, although it is unclear to me what form of averaging of the B matrix weights would be appropriate. There would however still be the problem that no account was taken of the indirect effect of data, in any period, changing the B matrices for a month and thus the reconstruction trend.

    I agree that similar peninsula stations may be substituted for each other in measures of the weighting of the peninsula. I thought the 8% match to my calculation was coincidence primarily because I got quite different figures using your method depending on how I chose the index rows, and also I couldn’t see how a method based on solving equations that involved imputed as well as actual surface station data could give the right result. Further, my figures for West Antartica and the Ross ice shelf weighting didn’t come close to yours.

  24. Nic L said


    21, 22: Thank you for your comments. You raise some interesting questions.

    I agree with your point that the effect of the LF component (trend) is decreased relative to the HF component (wiggles) as the length of the station history decreases. As you say, the effect of detrending a short history station like Asuka, on the imputation of the station itself and on the average reconstruction trend, will be much less than that of detrending a long history station like Admunsen Scott even if they both had the same trend and HF correlations with other data series. I agree that that, as a result, my method (or indeed back use of B matrices or any other methods involving RegEM’s output) does not measure how much influence temperature trends at each the surface station location would have had on the average reconstruction trend had they all been measured throughout the 1957-2006 period. However, whilst that is an interesting question from a theoretical point of view, it wasn’t what I set out to answer.

    I should have clarified that when I wrote that I was setting out to answer the “what weight the temperature trends in each surface station have in the overall temperature trend of the reconstruction”, I meant the trends in the actual surface station data series used by Steig, not what the trend would have been had there been a surface station at that location producing data for the entire 50 year period. If we are interested (as I think we are) in working out to what extent Steig’s average reconstruction trend was biased by the inclusion of unrepresentatively large or small numbers of surface stations in particular areas, or indeed by anything else, then surely it is the data that actually went into producing Steig’s results that matters? A short history station like Asuka would have had little effect on Steig’s average reconstruction trend, and that is what my method reports. So I don’t see that the fact that a short history station, and the detrending thereof, has less effect on the average reconstruction trend is a problem in itself.

    However, I think you may also be raising a more subtle point, that if one has a short history station with high correlation with nearby stations (Deception is a better example than Asuka, which is not a peninsula station), its imputed data (solution) will be pulled towards the record of the correlated stations, and will thereby affect the average reconstruction trend in a somewhat similar way to those stations (both actual and imputed data points being treated the same in the covariance matrix that drives the B matrices). In effect, a short history station would to an extent act as a surrogate extra copy of other stations with which it was correlated. Importantly, the station’s imputation would reflect the trends of the correlated stations even when the station’s own data was detrended, and the effect of the existence of that station on the average reconstruction trend would remain largely unaltered by such detrending.

    I think that is a valid point in principle. The mere existence in the RegEM input matrix of a station with a short history (of whatever trend) but during that period a strong correlation with other stations could bias the average reconstruction trend towards that of the correlated stations. If the effect were significant, it would presumably show up in the peninsula, where stations are close together and strongly correlated. However, the effect appears to be weak, and not to compromise my results, at least at the regional level. If it were strong, the effect of detrending all the peninsula stations together would be considerably stronger than the sum of the effects of detrending each peninsula station separately. But, as reported above, it is not. Nor is the effect of removing all the peninsula stations from RegEM entirely significantly different from either detrending all such stations separately or all simultaneously.

    I initially thought that your suggestion of extracting the short history station solution from Xmis, detrending that, and then plugging it back into RegEM in its entirety would work, assuming that the actual data were also detrended at some stage. But by doing so the detrended imputed data is treated by RegEM as actual data and directly affects (via the B matrices) the imputation for other correlated stations with missing data, and so has a major distorting effect on the average reconstruction trend.

    I’m not sure I follow your thinking when you say that “stations with long station histories get updated quite often”. As you say, the actual B matrix used varies as the different months with missing data are updated in turn during the iteration, changing whenever there is a different set of stations that actually have data for the month involved. However, all stations get their Xmis imputed data completely updated on each iteration of RegEM, using the B matrices computed from the covariance matrix (based on actual + imputed data) as it stood at the start of that iteration. In any event, the problem outlined in the previous paragraph seems to make the approach you suggest impracticable.

    Maybe the best alternative to detrending for a short history strongly correlated station is simply to entirely remove it from the RegEM input matrix, which would obviously prevent it acting as a surrogate partial extra copy of stations it is correlated with? Removing Deception from the RegEM input reduces the (spliced) average reconstruction trend by 0.00012 C/ decade, actually much less than the reduction of 0.0005 reported by my detrending method (it is closer to the reduction of 0.00004 calculated using my approach but adding 0.02 to the trend instead of detrending). But relative to the overall effect of the peninsula stations on the average reconstruction trend, the 0.00038 difference is only 1%.

  25. Ryan O said


    Thanks for the clarification. I misunderstood what you were trying to accomplish. It is subtly different from what Jeff and Steve had done, and I don’t believe the results are directly comparable.

    Your analysis is determining the approximate contribution of the actual data to the overall trend. Steve’s/Jeff’s analysis is determining the contribution of the imputed station data plus actual data to the trend.

    The Steve/Jeff analysis is done that way because the imputed values for the AVHRR PC are dependent not only on the actual station data, but since X is updated by Xmis after each iteration, they are dependent on the imputed data as well. While the column representing “Asuka” may not have much actual data (and so the actual data has minimal impact on the solution), the fact that there is a column representing “Asuka” will cause that column – be the data actual or imputed – to have an effect on the trend.

    Remember that the solution is driven by the SVD approximation of the correlation matrix (it’s on the correlation matrix – not the covariance matrix, because the 1/D factor scales each column to unit variance – so you get exactly the same result if you substitute cor(X) for matrix C, though you still have to calculate 1/D in order to unscale after each iteration). Because it is on the correlation matrix, the low-order EOFs will contain the information from the stations that contain the most common patterns, while stations with less common patterns are relegated to the higher order EOFs (which are thrown away). The solution, then, will be primarily driven by which patterns are most represented in the data set. In this case, it is the Peninsula stations that have the highest representation. Therefore, the SVD guarantees that the Peninsula station information (imputed and actual) is captured, while some other station information is not.

    This is not axiomatically a problem; since the AVHRR PC solution still depends on the relationship of the Peninsula stations to the PC. The Peninsula information, of course, is not necessarily copied to the PCs unless there is a high degree of correlation in the calibration period. However, whatever the relationship might be, for it to be meaningful, the eigenvector for that AVHRR PC must explain variation at the locations used for the regression. If it does not, then that PC is not a description of temperature at that location and the relationship is physically meaningless.

    In this specific case, it is a problem because if you look at the AVHRR data, there is a decent correlation between the Peninsula and AVHRR PCs 2 & 3. While the eigenvectors for these PCs have low weights in the Peninsula, RegEM has no way of knowing that. All it does is find the relationship between the high-order EOFs (which includes imputed data) and existing PC data. RegEM doesn’t know that PC 3 explains virtually zero variation in the Peninsula; it simply finds the relationship and extrapolates. So you end up with the perverse situation where stations located in regions where a PC explains zero variation drive the imputation of the PC. Not only that, but it is a self-reinforcing proposition. As the imputed solution for short record length stations approaches the stations the actual AND imputed data show the most correlation with approach each other, the chance of that group of similar patterns being represented in the low-order EOFs increases.

    That is what the Steve/Jeff analysis does: measures the resulting imputed and actual data contribution to the trend. It is also why removing 1 station (or changing 1 station) may not have much of an impact if there are still a sufficient number of stations to force their pattern to be represented in one of the first 3 RegEM EOFs. You will only see the true impact when you remove enough stations to force that EOF into the higher-order EOFs.

    For the B matrix:

    Matrix Xr is computed using both actual data (V11) and imputed data (V21). This is not broken cleanly like it is in Steve’s diagram on CA, since Steve first imputed the ground station set in its entirety for simplification. V11 and V21 are amorphous, having elements from all over the combined matrix. The actual and imputed values (and the relationship between them) of course changes from month to month regardless of whether the stations with actual data changes.

    So your B matrix is composed of a sampling of the real B matrix. In the full matrix, the values in December may not be the same as the values in November regardless of whether the station complement is the same. By using the November value in December, you are carrying forward white noise as signal. By computing each month individually (assuming the noise is approximately Gaussian), the noise gets largely cancelled out. That is my concern with the sampling method you chose for B. Yep, Xmis gets calculated for each month, but B does not, and Xmis depends on B.

    For long station histories, this cancellation may occur anyway because the number of points you are sampling at that location is much greater (since the actual data persists throughout many station changes). But for the short station histories, you might only sample B a couple of times, which would seem insufficient to prevent capturing a large amount of the noise.

    The reason for taking the entire solution from Xmis and plugging that back in is because, as far as RegEM is concerned, the SVD of the correlation matrix does not care whether a value is real or imputed. All it knows is that a value is present.

  26. Kenneth Fritsch said

    I think that this thread is as good as any to put down my current thoughts on the analyses of the Steig reconstruction.

    What I find revealing about the Steig authors explanation of their methods is that the reader could easily conclude that the methods were optimal. On the other hand, when I read analyses that have been under taken here and at CA, I get the distinct impression from the less than mathematically precise language required in describing the results that a valid black and white evaluation of the Steig methods is not attainable.

    From this layperson’s perspective it appears that a proper evaluation requires the sensitivity tests that have been employed in these analyses. It also appears to me that the further we get away from the simpler methods of reconstruction the less straight forward is the evaluation of the method. To that end I wanted to put forward some simple-minded approaches to analysis and reconstruction.

    The major critical element in the Steig reconstruction is the AVHRR data as measured and adjusted for the 1982-2006 period. In my mind the first question that needs to be answered about the AVHRR data is whether it is reasonably in line with the station data and AWS with regards to determining temperature anomaly trends. Could not this be done by using the available surface station and AWS data without infilling and comparing the resulting trends with those derived from the AVHRR data from the proximate grid cells? I do not recall the proximity of the AWS stations to surface stations but could not the AWS stations be used as an independent standard for comparison. The intent here should be to validate using the AVHRR data and then if validated using the AVHRR data to infill the missing station data. If we do not see a significant difference in trends between AVHRR and stations we have more confidence in using the AVHRR data as noted above. If we see differences than we have to decide if the AVHRR data is the more correct of the station data or whether neither can be trusted. A third and fourth comparison using AWS data might clarify the AVHRR to surface comparison.

    If our comparisons cannot give us confidence in the AVHRR data (we see significant differences), I am not sure how we would proceed. If the surface and/or AWS station data were thought to be more correct then we could perhaps proceed by doing a reconstruction using the surface data as the standard and then one with the AVHRR data as the standard. That process immediately becomes more complicated by the missing data from the station data and how it would be in-filled. It is at this point that I think without showing the validity of the AVHRR data, the Steig reconstruction meets a roadblock that it cannot get beyond.

    If we can show good confidence in the AVHRR data we have at least a decent picture of the entire Antarctica for the period 1982-2006. In order to proceed to the 1957-1981 period I judge that we need to have not only shown confidence in the AVHRR data but the surface stations also, i.e. the two data sources must be in good agreement.

    Given that we have established the data series agreement, we can proceed with a reconstruction of the 1957-1981 time period. This reconstruction has two critical factors: (1) that the relationship of the station data to AVHRR grid points remain reasonably constant over time and (2) that the large amount of missing data from the station data for this period can be accurately in-filled.

    Testing the stability of the grid point to station relationship would have to be established in the 1982-2006 period. We actually have a rather complete set of AVHRR data at these locations to make this test. Remember that we have already required that we have to have good confidence in the AVHRR data to proceed.

    To infill the missing station data will require as simple a method of infilling as we can conceive. Since we have already required good confidence in the AVHRR data and a stable relationship between the station locations to AVHRR grid points, we need only use the relationship that can be established in the 1982-2006 period using the AVHRR data at the station locations to the AVHRR grid point data that is without a station in the locale.

    I think at this point we would have sufficient data to make the reconstruction complete. Remember, however, that to do this we required that the AVHRR data be correct with reasonable confidence and that the AVHRR grid points to station areas data relationship be stable with good confidence and finally that the AVHRR data and corresponding proximate station/AWS data be in good agreement. If any of these assumptions are invalid I say that we (and the Steig (2009) authors) are better to stick to back of the envelop calculations.

  27. Nic L said


    Thank you for your insightful comments. I agree with much of what you say but I do have some remarks.

    To clarify further the differences between my and Jeff’s analysis of station weights, mine is as you say determining the approximate contribution of the actual data to the overall trend. It does so over the full 1957-2006 period and includes indirect effects of the actual data for the station concerned on the data imputed for all stations with missing data in all months (not just when actual data exists for the station), mediated via the effect of that station’s actual data on the correlation matrix and hence on the B matrices. Notwithstanding the questions that Jeff and you have raised, in practice the station by station detrending approach appears to give answers to this question that are supported by other evidence (from detrending or omitting stations over entire areas).

    I agree that Jeff’s back calculation analysis determines the effect of the imputed plus actual data for each station; actually on the average reconstruction temperature over the 1957-1981 period: I am not sure the result translates directly into the magnitude of the effect on the 1957-2006 trend. In principle, Jeff’’s method attributes the effects of imputed surface station values on the average reconstruction to the station where they are imputed for the period in which imputed whereas mine attributes it to the underlying stations (and satellite data) causing the imputation, taken over the full period.

    For example, if there was a station in the peninsula with a very short record that was perfectly correlated with a station in east Antarctica with no missing data, so that the peninsula station’s imputed values (ignoring the effects of regularisation) reflected just the record of the east Antarctica station, then my method would attribute the effects of those imputed values to east Antarctica whereas Jeff’s method would attribute them to the peninsula if the stations record was pre-1982 and ignore them altogether if its record was post 1981.

    Steve’s analysis effectively took the surface station imputed values as fixed for the sake of his algebraic argument, but in reality they are functions of all the actual data, as all data points affect the correlation matrix and hence the B matrices. Therefore, Jeff’s back calculation of the average reconstruction temperature does not give a unique solution (there being far more than 34 unknowns, the imputed values not being fixed), and the solution that it gives is strongly dependent on the rows selected as its input. Using lm on all pre 1982 data probably gives the best available solution (and one that differs greatly from Jeff’s solution as regards the peninsula weight). However, even then the effect of post 1981 data (which is important in determining the correlation matrix, and affects the pre-1982 imputation) is ignored.

    I entirely agree your comments about the effects of the SVD and regularisation process. I find it instructive to look at which surface stations have high values in the first 3 columns of the V matrix of the svd of the correlation matrix, all other columns being ignored when regpar=3. Using a cut off absolute value of 0.16, they are:

    1st regularisation dimension (on which satellite PC1 has a modest correlation, and satellite PC3 a slight correlation, both with the same sign as the surface stations):
    Arturo_Pratt, Bellingshausen, Deception, Esperanza, Faraday, Ferraz, Great Wall, Jubany, King Sejong, Marambio, Marsh, O’Higgins, Orcadas, Rothera, San Martin, Signy

    2nd regularisation dimension (on which satellite PC1 has a substantial correlation with the opposite sign to that of the surface stations):
    Adelaide, Admunsen_Scott, Asuka, Belgano_II, Casey, Davis, Dumont_Durville, Leningradskaja, Macquarie, Mario_Zuchelli, Mawson, McMurdo, Mirny, Molodeznaja, Novolazarevskaya, Syowa, Vostok, Zhongshan

    3rd regularisation dimension (on which satellite PC2 has a substantial correlation with the opposite sign as the surface stations, and satellite PC3 has a substantial correlation with with the same sign as the surface stations, or vice versa where the station name is in brackets):
    Adelaide, (Belgrano_I), (Belgrano_II), Byrd, (Grytviken), (Halley), Leningradskaja, Mario_Zuchelli, McMurdo, Russkaya, Scott_Base

    As you say, the first dimension (the first reconstruction PC) is comprised of peninsula (including island) stations. So far as I can see, the direction of the correlations are such that an increase in temperature in these peninsula stations reduces the average reconstruction trend, which is mainly determined by satellite PC1 with a negative weight. This seems surprising; maybe I have got something wrong. However, it is consistent with several of the peninsula /island stations having a negative weight in my contribution analysis.

    The second dimension /reconstruction PC is almost entirely comprised of east Antarctica stations, including 2 stations designated Ross ice shelf, I think originally in your analysis.

    The third dimension /reconstruction PC comprises the two west Antarctica stations along with a mixed bag of other stations, most of which also appear in the first two reconstruction PCs, and some of which have different signs from the rest.

    Interestingly, satellite PC1 has a substantial value on the 4th and 10th columns of the V matrix, suggesting that using regpar=3 is over-restrictive even when only 3 satellite PCs are used.

    I agree that 3 satellite PCs are insufficient to resolve the peninsula separately. I think we all share the view that using more satellite PCs would have been appropriate. However, I wasn’t sure if I had fully followed your statements about there being a decent correlation between the peninsula and AVHRR PCs 2 & 3, but the eigenvectors for those PCs having low weights in the peninsula and PC3 explaining virtually zero variation there. In that case wouldn’t it have a negligible correlation? The low numbered AVHRR cells, which I believe are in the peninsula, do seem to have generally higher weights on PCs 2 & 3 than on PC 1.

    I understand the concern about imputed data for short history stations that are well correlated with high trend stations reinforcing the characteristics of stations with which they are correlated, but I am not convinced it is a major issue in practice. The prime candidate for such a station is Deception. As I said, removing this from the RegEM input reduces the trend by 0.00012 C/decade, less even than the change I got from detrending Deception but leaving it in RegEM. However, I note your point that removing one station may not have much effect if the pattern of similar stations remains represented in one of the first three RegEM EOFs/PCs. That is a good point, but it is difficult to predict the effect. As the 1st PC/EOF that currently carries the peninsula “signal” appears not to carry much weighting for temperature, the effect of the peninsula signal dropping out of the first 3 EOFs/PCs may not be that great. The obvious way to estimate the effect is to remove all the peninsula /island stations from RegEM, which I calculate reduces the trend from 0.1172 to 0.0855 (using the CA archived data), a fall of 0.317, a figure that is in line with my estimate of the total weighting of the peninsula derived from detrending the individual stations one by one. That suggests the effect is minor.

    Am I right in thinking that your comments on the B matrices relate to my attempts to speed up the execution of RegEM by only calling the pttls sub-routine when the set of station with actual data changes? I’m afraid I don’t understand your problem with my changes to RegEM – may be I am missing something. You say that Xr is computed using both V11 and V21, which depend on actual and imputed data. I agree. You then say that the data changes from month to month regardless of whether the set of stations with data changes, and imply that that means that the B matrix changes. But V11 and V21 depend only on svd(C) and on what stations have actual data. In the original RegEM script, C is only calculated once each main RegEM iteration. It is not recalculated every month within each RegEM iteration as j is incremented through the months. I have simply retained that basis of calculating C, and svd(C), once each main iteration. So, if your point is valid, it applies equally to the original version of RegEM.

    Apologies for the length of this reply !

  28. Ryan O said


    My concern has nothing to do with C. C is calculated based on the actual AND imputed values in X, not just actual data. Remember that the missing values in X are updated by Xmis at the end of each iteration. C is calculated based on that.

    The concern is that the solution for Xr does change every month because kavlr[[j]] has a different set of values for each month. These lines:

    for( j in 1:n) { kavlr[[j]]= (1:p)[!indmis[j,]];
    kmisr[[j]]= (1:p)[indmis[j,]]}

    set up the missing and available values on a month-by-month (line-by-line) basis. Xr (and, hence, B) doesn’t just change when the stations change; it changes when the anomalies for each month changes – which happens every month, regardless of whether the station complement changes.

    However, I wasn’t sure if I had fully followed your statements about there being a decent correlation between the peninsula and AVHRR PCs 2 & 3, but the eigenvectors for those PCs having low weights in the peninsula and PC3 explaining virtually zero variation there. In that case wouldn’t it have a negligible correlation?

    If the eigenvectors were calculated using station data, yep, this would be true. But the eigenvectors are satellite data, and there are differences between the satellite data and ground data. You could do PCA on the stock market and find correlation to stations in Antarctica. That doesn’t mean that they are physically related or that you can use one to predict the other.

    As far as the signs of the correlations go, to make sure you have the right sign, you need to multiply the station data by the corresponding eigenvector weight in the station location to accurately use station data to predict satellite data. That is one of the problems with the Steig method – it can’t distinguish whether stations should be positively or negatively correlated because it doesn’t know the eigenvector sign at the station location. That, also, causes problems and might be the source of the strong negative weights Jeff noted.

    If you look at the stations in parenthesis, and then look at the eigenvector weights at that station location, I bet most of those negative correlations turn positive – because you have to multiply the station information by the eigenvector weight before calculating correlation.

  29. Nic L said


    I now understand your concern, but I think you have misunderstood what kavlr and kmisr are. They are not lists of the missing and available values of X for each month, but rather of the column numbers of the missing and available data for each month. They therefore only change from one month to the next when the station complement changes. The first four months’ kavlr vectors are:

    kavlr[[1]] # [1] 2 5 8 13 15 18 19 23 27 28 29 34 39
    kavlr[[2]] # [1] 2 5 8 11 13 15 18 19 23 27 28 29 34 39
    kavlr[[3]] # [1] 2 5 8 11 13 15 18 19 23 27 28 29 34 38 39 40
    kavlr[[4]] # [1] 2 5 8 11 13 15 18 19 23 27 28 29 34 38 39 40

    Accordingly Xr and B also only change from one month to the next when the station complement changes, svd.ant being fixed during each RegEM iteration).

    Are you now happy with the way I modified RegEM?

    Thank you for your explanations on the correlations and eigenvector weights points. I think I had been interpreting the word “explaining” in a statistical sense whereas you had meant it in a physical one.

  30. Nic L said


    #26. I don’t think I’m the best person to comment in detail on your thoughts; Jeff Id and Ryan know much more than me about the satellite data in particular. So just a few brief comments, for what they are worth. The spatial sparseness of the surface station data and the evident shortcomings of, and lack of pre 1982 data for, the satellite records are a major limitation, particularly for the 1957-1981 period. Incorporating the AWS data seems sensible and would ameliorate the spatial sparseness of the surface station data to some extent after 1979. But I think that it is probably possible to show (a) that Steig’s reconstructions had upwards biases in their overall trends, to establish what the principal causes of those biases were and to have a shot at correcting them; and (b) that the spatial distribution of trends shown in Steig’s reconstruction are more an artifact of his methods (in particular his choise of 3 satellite PCs and regpar=3) than a representation of physical reality.

  31. Ryan O said

    #29 Nic,

    You are correct, sir!

    And I no longer have any objection to your modification. 😉

  32. Ryan O said

    However, I note your point that removing one station may not have much effect if the pattern of similar stations remains represented in one of the first three RegEM EOFs/PCs. That is a good point, but it is difficult to predict the effect. As the 1st PC/EOF that currently carries the peninsula “signal” appears not to carry much weighting for temperature, the effect of the peninsula signal dropping out of the first 3 EOFs/PCs may not be that great. The obvious way to estimate the effect is to remove all the peninsula /island stations from RegEM, which I calculate reduces the trend from 0.1172 to 0.0855 (using the CA archived data), a fall of 0.317, a figure that is in line with my estimate of the total weighting of the peninsula derived from detrending the individual stations one by one.

    I am much more convinced now that the effect of the Peninsula stations is less than we had initially thought and the bigger issues are the lack of sufficient AVHRR PCs and the fact that RegEM doesn’t know the magnitude and sign of the eigenvector weights at the station locations.

    By the way, how much faster did RegEM run with your modification?

  33. Nic L said


    #31 I am glad we have cleared that up 🙂

    #32 The effect of the large number of peninsula stations effects is by no means negligible – it biases the average reconstruction trend up by at least 0.02 C/decade, perhaps 0.025, on my rough calculations. Also, how much effect on the trend did the calibration corrections for satellite drift and ofsets that you identified have?

    I quite agree that the small number of AVHRR PCs is unduly restrictive. I can see no justification for it. I think that there are various other grounds for being suspicious of the effects of RegEM, as well as the eigenvector weights issue that you identify. For instance, the fact that it still produces a positive trend when all input series have been detrended worries me.

    I did not time RegEM before and after, but my modifications reduce the number of calls to the pttls subroutine by over half, from 600 to 261.

  34. Ryan O said

    Nic (and Kenneth – since Ken had made mention of the satellite data problem earlier),

    The satellite trend from 1982-2006 differs from the mean ground station trend by 0.07 deg C/decade. Not trivial. Unfortunately, this does not directly translate to reconstruction trends because the ground stations are not scattered uniformly across the continent. In the reconstruction (depending on how you do it) the trend in the satellite data contributes anywhere from ~0 (my special version of truncated SVD) to ~ 0.05 (Steig version).

    The “warming” trend in RegEM is not necessarily a concern simply because the ground stations are not uniformly scattered. So based on the covariance determined from the satellite information, they could legitimately all be detrended yet the overall reconstruction trend comes out positive. However, it always seems like these errors go the way of warming. 😉

    Another related problem is something Jeff Id discovered a while back. He put a -0.2 deg C / decade trend on all of the stations and what popped out was about half that. It seems suspicious that the RegEM result dampens negative trends while amplifying positive ones. I think the root cause of this is the positive trend in the satellite data as compared to the ground stations. I haven’t tried it, but if you did the same thing with the satellite trend excessively negative, I bet that positive trends get dampened and negative ones get amplified.

    RegEM seems to have a problem with trends reinforcing each other when you attempt to use it for essentially principal component regression.

  35. Kenneth Fritsch said

    The spatial sparseness of the surface station data and the evident shortcomings of, and lack of pre 1982 data for, the satellite records are a major limitation, particularly for the 1957-1981 period.

    Major limitation of the satellite records for the 1957-1981 period has to be, at least in my view, the understatement of the week.

    Seriously, after thinking about the Steig reconstruction for my long winded post above I think I can summarize my thoughts. For the reconstruction to be valid, excluding considerations for the methods employed, requires all of the following:

    (1) the AVHRR data needs to be validated, (2) as does the station data, (3) validation for the relationship of the AVHRR data with the station data being reasonably constant over time, and finally (4)that a reasonable method of in filling is available for the missing station data in the 1957-1981 period.

    It is also my opinion that the Steig authors and we analyzers have not used all the data available to us in an attempt to do these validations.

  36. Nic L said

    #35 The major limitation I was referring to was of the ability to make a sensible reconstruction (particularly on any spatial basis), not of the satellite data itself. The lack of any pre 1982 data for the satellites means that any reconstruction for the 1957-1981 period is going to be much more inaccurate. (I know that Steig actually used the “spliced in” actual satellite data for the 1982-2006 period rather than a reconstruction using the RegEM solution, but in principle with poor qulaity satellite data it would have been logical to do the latter, so I think of the 1982-2006 period as also involving a reconstruction.)

  37. Nic L said


    I forgot to say that the main reason that I modified RegEM was to get it to produce and return the “unspliced solution” PCs and weights; in the course of doing so I realised that I could also speed it up by reducing calls to the pttls subroutine (which of course was called 600 times per iteration, not 600 times in total).

    I entirely agree with you that RegEM has a problem with trends.

  38. Ryan O said

    Nic and Kenneth,

    Another problem with the Steig method of splicing is that the reconstruction solution and the original three AVHRR PCs don’t mean the same thing. The AVHRR PCs are just that – PCs from the AVHRR data. They are orthogonal and have no ground information included in them.

    The reconstruction solution for the PCs, however, is different. Because the total weight of ground stations is much greater than the weight of each individual PC, the pre-1982 solution is a non-orthogonal rotation of the PCs. In the Steig case, they additionally do not constrain this rotation by the eigenvector – any station anywhere can cause the rotation based on correlation to the PC. So essentially, they rotate the PCs and then splice the non-rotated PCs onto the latter half of the reconstruction. The only way this would be mathematically correct is if the rotation were very small. However, it is not; PC 2 and 3 become nearly perfectly correlated in the pre-1982 timeframe. To make their method valid, the weights of the PCs must be increased in RegEM and the SVD done on the covariance (not correlation) matrix. As the weights of the PCs becomes much greater than the weights of the stations, the pre-1982 rotation becomes increasingly small.

    As far as the concerns about the match of the AVHRR data to ground data goes, the answer is dependent on how you use the AVHRR data. If, as Steig has done, you directly use a portion of it, you absolutely need to figure out why the AVHRR trend is so different from the ground trend in the 1982-2006 timeframe. If, however, you use none of the original PCs and rotate them to match the ground station data, then the embedded trend in the AVHRR data no longer matters. The only thing that matters is if the eigenvectors accurately reflect the covariance from stations to the areas between stations.

  39. Kenneth Fritsch said

    Ryan O, your explanations of the methods used and the weaknesses of the applications involved make sense to me, but my concerns here involve whether the basic data, required to make the applied “correct” methods work, is any good or reasonably good.

    Since in my estimation, both the station and AVHRR data must be shown to be good for reconstructing that means that the AVHRR data in close proximaty to the station locations must be in good agreement with that station data. That is the first critical check on the data to be used in reconstructing.

    If the first step receives validation, the next step requires simply comparing the AVHRR data in proximity of the stations with AVHRR data at locations away from it and determining if these data location relationships are stable over time in the 1982-2006 period.

    If step 2 receieves validation, the final step in my view is that an in filling method for the surface station data in the 1957-1981 period can be established that is simple to understand and most importantly can have a confidence interval derived for the in filling process.

  40. Jeff Id said

    #39, The satellite data isn’t a big problem, I don’t think. The reconstruction oddly enough is not a reconstruction of surface temperature datasets but rather a reconstruction of the AVHRR surface skin temperature dataset. The slope of the input data has little effect on the final result because only the high frequency covariance info is important. Remember, Ryan’s corrected (low slope) sat data has nearly the same result as the high slope data. It tries to match slope but if there aren’t any options (Surface stations with enough slope), the EM just does best fit wiggle matching, overshooting the wiggles in one area and under in another to compensate for unmatched trend.

    Since even with heavy cloud noise and satellite influences the satellite data should have the majority of the cold and warm high frequency delta’s correct, the imputation works. Something that’s been discussed only a couple of times in the comments, but the calibration is backwards. The real reconstruction of surface information is contained in the surface station records.

    What Steig et al did was to reconstruct the skin temperature with surface info. Since the AVHRR TIR skin temperature has a higher slope the backward calibration tends to favor high slope surface stations.
    Therefore the correct thing to do after running a reconstruction this way would probably be to recalibrate slopes by a fixed constant across the continent to the surface station data. This would fix the backward result and turn it back into surface air temperature.

    I hope Ryan catches this, I know he’s already aware of the problem.

  41. Ryan O said

    #48 The weighted eigenvector method resolves that. I would interpret it as a reconstruction of surface air temperatures, not satellite skin temperatures. The reason is that the eigenvector weighting forces the solution to the ground station temperatures, not the satellite temperatures. The match of the reconstruction temperatures to the ground station matrix is almost perfect.

    The eigenvector weighting method essentially only uses the covariance information from the satellite PCs to interpolate the empty space between the stations. The trends in the satellite data don’t matter, unless they are so large that the calibration is poor. The answer is unaffected by the trend.

    You are also correct in that the real reconstruction is the surface station imputation. The satellite part is just gravy to get a prettier picture. It’s not even a reconstruction per se since it simply uses the eigenvectors to fill in empty space. It’s the station reconstruction that is the real reconstruction.

  42. Ryan O said

    #39 I sent a post to Jeff. I think you will enjoy the approach. The first thing I do is a check on the accuracy of the ground station imputation. Then I do another check on the satellite data. Pretty much your steps 1 and 2, just with a bit of a twist on them. 😉

  43. Nic L said

    #38 Ryan,

    I agree that the reconstruction solution PCs and the AVHRR PCs are completely different. There are regpar solution PCs, which exist independently of the number of AVHRR satellite PCs or indeed whether there are any satellite PCs input to RegEM at all (there are none for the AWS reconstruction). But I think it is slightly misleading to say that the solution PCs (Xpcs as produced by my version of RegEM, and which are defined over the whole 50 year period) are non-orthogonal. They are orthogonal over the whole 1957-2006 period. They are necessarily therefore not orthogonal over the 1957-1981 sub-period, just as the satellite PCs are non-orthogonal over, say, the 1995-2006 sub-period.

    Whilst in formal terms any two sets of 3 PCs can be rotated into each other, I find it difficult to regard the solution PCs as being a rotation of the satellite PCs in anything other than formal terms. The solution PCs are derived using the iterated TTLS calculations, with the satellite PCs being only 3 out of 45 inputs. The reconstructed satellite PCs for 1957-1981 equal the solution PCs multiplied by the weights calculated in RegEM for the satellite PC data series; the surface stations reconstructions are calculated in the same way. So I would describe what happens as RegEM derives a set of 3 (=regpar) orthogonal solution PCs and weights to apply to them in order to reconstruct all the data series over the entire period, but leaves unchanged all actual data values. The actual values just happen in the case of the satellite PCs to be another set of 3 orthogonal PCs, but one defined only over the 1982-2006 sub-period. In formal terms they are accordingly a non-orthogonal rotation of the solution PCs, but I don’t see that that is the best way to regard them.

    I can see the attractions of using the solution PCs in the light of the doubts, inter alia, as to whether the trends in the AVHRR data are an accurate representation of surface temperature trends over 1982-2006. I think that is what you are saying in your final paragraph. But why do you say that the splicing result is only mathematically correct if the rotation is small? Although I am no expert on the subject, I have seen nothing to suggest that in the TTLS and EM literature. Are you saying that using RegEM to impute PCs that do not represent the same sort of variables as the rest of the data series is in principle invalid?

    I certainly agree that the high correlation between reconstructed satellite PCs 2 & 3 is not at all good, and I think it points to the 3 satellite PCs/ 3 reconstruction solution PCs being poor, if not close to being unstable as regards spatial information at least.

    Your suggestion of performing regem on the covariance matrix or vastly increasing the weight of the satellite PCs in some other way would indeed force the solution PCs much closer to the satellite PCs, but I don’t see why that would make Steig’s method valid.

  44. Jeff Id said

    #43, Nic, I’ve been following along in the background for some time. The issue I believe Ryan is referring to is the fact that surface stations are controlling the weighting of the 3pc portion of the imputation so it becomes a blend of surface station trend with skin temp trend – by Ryan’s mathematically correct statement he may have meant physically correct.

    Forcing the wighting to the PC’s results in an improved skin temp reconstruction (which you probably know is what Steig et al is). In it’s current format the recon is a mash of skin temp and surface air temp. By using high weights on the surface data the solution becomes a surface station solution as Steig et al. was supposed to be.

  45. Ryan O said

    #43 Nic,

    I think we’re mixing terminology. When I say “solution PCs” I mean the imputed AVHRR PCs, not the ones generated by the SVD of the correlation matrix in RegEM. For clarity, I will always refer to the imputed AVHRR PCs as the “solution PCs”.

    I agree entirely with your description, assuming you mean the correlation matrix PCs in RegEM. Those are indeed orthogonal, and there is no means in RegEM to rotate them.

    The rotation occurs with respect to the AVHRR PCs. The results of the imputation are essentially a rotation of the PCs to ground station data, with the amount of rotation being constrained by the relative weights of the station data to the PCs.

    I apologize for the terminology. I had been using “solution PCs” to describe the imputed AVHRR PCs for some time now, and I hadn’t noticed that you referred to the correlation matrix PCs in RegEM as the “solution PCs”.

    Now that the terminology has been cleared up, I’m curious as to whether you have any objections to my description.

  46. Nic L said

    #45. Ryan,

    You’re right, we have a confusion of terminology. 🙂 I will try henceforth to refer to the PCs produced inside RegEM as a solution to the TTLS equations as the “RegEM solution PCs” or the “TTLS solution PCs”; where Truncated SVD is used I will likewise use “TSVD solution PCs”. And I aim to use “imputed [AVHRR or satellite] PCs” or “[AHVRR or satellite] solution PCs” (or “unspliced solution” when referring to the 1957-2006 period) when referring to the (non-orthogonal) PCs for the AVHRR satellite PCs imputed by RegEM (or TSVD).

    Your description now makes sense to me. One statement you make I would however question. You say

    “If, however, you use none of the original PCs and rotate them to match the ground station data, then the embedded trend in the AVHRR data no longer matters.”

    I think you are saying here that the unspliced [satellite PCs] solution will be unaffected by the trend in the AVHRR data? I agree that the unspliced solution will be much less affected, but I have found that it is by no means unaffected.

  47. Nic L said

    #44. Jeff,
    Thank you for your clarification. It is unclear to me what exactly the relationship between surface station temp and skin temp is. Presumably it could in principle have a very good HF (monthly) correlation even if there is a trend/ LF drift? Whether or not that is actually the case I am not certain. As you say, Steig’s reconstruction is strictly of the skin temp but the recon is a mash of mostly (just above) surface temp and (a PCA representation of) skin temp. I agree that reducing the weighting of the satellite PCs in the recon brings it much closer to a pure surface station recon. Although in principle doing so loses the benefit of the spatial information contained in the satellite PCs, with only 3 satellite PCs there is arguably not much spatial info to lose.

  48. Jeff Id said

    #47 I agree. As I understand RegEM in the Steig version, the surface station information is assumed to be equivalent to the AVHRR. The AVHRR information has a higher slope which RegEM attempts to calibrate the surface station data to but there aren’t any DOF for changing slope and the recon is dominated by the HF covariance. The result should be a bias toward acceptance of the high slope stations however the HF info may overwhelm any detectable results.

    My point is that the Steig recon represents the best fit of surface station data to satellite AVHRR so it really isn’t surface station data anymore but rather a poor calibration to AVHRR data.

    It might be interesting to see a barplot of your weight results with the stations placed in order of trend. It’s probably noisy though. I think your post was received pretty well, let me know if you have anything else you want to put up.

  49. Nic L said

    #48 Jeff
    Many thanks for the suggestion of another post. I have in fact nearly finished a piece of work on the effect of trends on Steig’s RegEM reconstruction, which includes a bit on station contributions to the reconstruction trend. I think that the results will be of quite general interest. I am hoping to include a few charts / barplots this time ;). I will be in contact about in soon. I was going to include the results of some detailed work on the effects of weighting the stations and satellite PCs in carrying out the reconstruction, but I decided that the post would then be far too long. I will make that work into a separate, later, post if that suits. A barplot of my station weights (on contributions to the reconstruction trend) ordered by station trend would probably fit in better there.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: