## Effects of Surface Trends on Antarctic Reconstruction

Posted by Jeff Id on July 15, 2009

This is a guest post by NicL where he examines the effects of a variety of synthetic data on the Antarctic reconstruction. Nic has an interesting approach to his analysis which is always a good thing when looking at a complex algorithm. I’ve uploaded a Word document at the bottom which contains the original formatting and code for this post.

### ===================================================

UPDATE: Due to some new developments in the R version of RegEM used in this post. The results have been updated in a paper by Nic here. Although the result has improved it has changed very little and the conclusions are unchanged. A link to the revised post is here.

The effect of surface station trends on Steig’s reconstruction2

====================================================================

### The effect of surface station trends on Steig’s reconstruction of Antarctica temperature trends

In this article, I aim to show that RegEM, the Regularized Expectation Maximization algorithm used by Steig in its TTLS ((truncated total least squares) variant, is highly sensitive to trends contained in its input data, even when the trending data series have little or no relationship with each other apart from both having trends; to propose a method of counteracting this sensitivity and to evaluate it; and having demonstrated that the method works well to show that using it produces much lower average reconstruction trends than Steig’s method. But before I start, I would like to set readers a puzzle. Which surface station has the most impact on the average 1957-2006 trend per Steig’s reconstruction, and how much impact does it have? The answer, which I think will surprise most people, and casts further doubt on Steig’s results, is given in the last section of this article.

### The sensitivity of RegEM to trends in the data

There has been considerable discussion, in the context of Steig’s RegEM based reconstruction of Antarctica temperatures from 1957-2006 using satellite AVHRR data from 1982 on, of the likelihood that RegEM may impute a long term trend from one data series to another based purely on short term (high frequency) correlation between the two data series. I believe that is indeed a major concern. It is entirely possible for the temperatures at two sites to be affected similarly by short term factors but for them to exhibit entirely different trends over the long term. If, say, temperatures in the Antarctic peninsula exhibited a high frequency correlation with the data series representing the satellite temperature measurements, local temperature trends in the peninsula arising from oceanic causes could have a distorting influence on the satellite data series based reconstruction of Antarctica average temperature trends.

However, I have seen little discussion or investigation of the sensitivity of RegEM to data series that exhibit long term trends but which are not otherwise correlated in any significant way with the remaining data series – where high frequency correlations are negligible. I have therefore carried out some investigation into this subject. In order to do so, I first modified Steve M’s R-port of the regem_pttls function so that it would, inter alia, produce and report the RegEM TTLS solution, in terms of principal component (PC) time series (equal in number to the regularization parameter regpar) and weights on those PCs for each of the data series, from which the “unspliced trend” can be derived.

The modified regem functions script is set out at the end of this article. A script that tests its outputs for identity to those of Steve M’s original in the case of Steig’s main reconstruction is also given, along with scripts for all of the results set out below. I have taken advantage therein of some of the data preparation and map plotting functions from Ryan O’s comprehensive reconstruction script. All trends given in the text and scripts are in degrees C per decade, for the 1957-2006 period.

I will be presenting results for both the spliced and the unspliced Antarctica-wide average reconstruction trends. The unspliced trend is that calculated from the reconstructed satellite PCs imputed using the RegEM TTLS solution PCs over the whole 50 year period, as opposed to Steig’s spliced trend which uses the reconstructed satellite PCs from 1957 to 1981 and the actual satellite PCs from 1982 to 2006.

With considerable uncertainties as to the accuracy of the AVHRR data and the divergence between the trends therein and the more accurate surface station data, it seems inappropriate to allow the AVHRR data alone to determine the post 1981 reconstruction, as Steig did. I prefer to view the unspliced solution – which allows the satellite data (and trends therein, if not detrended) to influence the final result, but not to completely determine it over 1982-2006 – as the prime reconstruction result. The unspliced trend is not directly distorted by the high 1982-2006 trend (0.261) exhibited by the satellite data, which may not be an accurate measure of the near surface temperature trend, and the unspliced trend may accordingly be seen as a better measure of the true 1957-2006 temperature trend. To give an idea of the difference, if the reconstructed temperatures were flat from 1957 to 2006 at a level equal to the average actual satellite temperature over the 1982 to 2006 period, the spliced trend would be 0.033 whilst the unspliced trend would be zero.

There has been considerable academic debate as to whether the predictand data (the AVHRR satellite PCs in this case), at least, should be fully detrended before imputing missing values. Mann does not believe in doing so, and his co-authors Wahl and Ammann criticized von Storch *et al*. for doing so in their 2004 reconstruction of the past 1000 years’ climate. In response, von Storch *et al*. defended detrending and pointed out the dangers of nonclimatic trends being interpreted as a climate signal if trends were not removed. Steig’s main results were without any detrending, but (presumably recognizing that opinions on the subject were divided) he also gave results with the AVHRR data detrended. In both cases, he gave only the spliced solution.

#### The effect of adding linearly trending data series, with or without added noise

The effect of adding an additional data series consisting of a 50 year pure linear trend to my emulation of Steig’s main reconstruction is remarkable. Doing so increases the spliced average trend by 0.073 C/decade, from 0.117 to 0.190 – an increase of 62%. The unspliced average trend increases by 0.080, from 0.101 to 0.181 – an increase of 79%. How large was the trend of the series that I added? In fact, it makes no difference at all what the added trend is: 0.02, 0.2 or 2 C/decade trends all have exactly the same effect.

The insensitivity to magnitude is due to RegEM scaling all data series to unit standard deviation before calculating the covariance matrix and performing the TTLS calculations. Moreover, adding a data series with a negative trend, of any magnitude, has exactly the same positive effect on the RegEM output trend as adding a series with a positive trend. That is because RegEM is indifferent between incorporating with a negative influence a series with a negative trend and a negative correlation to other data series (two negatives producing a positive contribution to the reconstruction trend), and incorporating with a positive influence a series with a positive trend and a positive correlation to other data series.

Of course, adding a series with a pure linear trend is a little unrealistic. Where the standard deviation of a data series is dominated by random fluctuations rather than a linear trend, as is normally the case, the influence of a linear trend will be considerably diluted. I have modeled the effect on the unspliced solution of Steig’s main reconstruction of adding a data series consisting of a varying linear trend added to random noise with a monthly standard deviation of 1C. This is the same order of magnitude as the random fluctuations present in the surface station data, although most have a somewhat higher standard deviation than 1C. The results are shown in Figure 1 below, for trends between 0 and 2 C/decade. The same (positive) impact occurs even when the added trend is negative. It can be seen (a) that the effect is still substantial and (b) that it is non-linear.

When the added series incorporates a trend that is very small relative to random fluctuations, the effect on the RegEM reconstruction trend is minor. As the size of the trend added increases, the RegEM reconstruction trend increases much more sharply, but then the increase tails off as the linear trend comes to make up the dominant part of the standard deviation of the series. Adding a 1 C/decade trend increases the trend of the unspliced solution (the imputed values per RegEM for the entire period) for the reconstruction by 0.049 C/decade, or about 49%, and the spliced (imputed values before 1982, actual satellite data thereafter) trend by 0.045. As there are number of surface stations with trends of the order of 1 C/decade, it is quite conceivable that a significant part of the RegEM reconstruction trend reflects the effects of data series that have strong trends but (apart from that) little or no correlation with the other data series used by RegEM.

#### Replacing surface station data with random data having equivalent trends and variability

An obvious question which then arises is whether Steig’s satellite AVHRR reconstruction average temperature trend could have arisen wholly or mainly purely from the effects in RegEM of trends in the data, without that data needing to be correlated in any way with the satellite data (other than through both sets of data having linear trends). If using uncorrelated data with the same trends as the surface stations could have produced the same result for the average trend in the reconstructed satellite data, then the validity of the reconstruction trend would be highly questionable. If all the surface station data were uncorrelated with the satellite data, other than through both sets having trends, then the trend in the imputed satellite data would merely reflect some composite of the various trends in the surface station data, weighted according to certain factors, in particular the strength of those trends and the number of surface stations exhibiting them. In such a case, no link would have been established between the temperature fluctuations in the actual satellite data and that in the surface station data, or between data from different surface stations, so variations in surface station temperature could not validly be used to impute satellite data in the pre 1982 period.

To test the possibility that Steig’s trend could have arisen purely from the effects in RegEM of trends in the data independent of other correlations between the data, I replaced each of the surface station temperature series with a series consisting of a pure linear trend equal to that station’s trend, and added random (normally distributed) noise with a standard deviation sufficient to bring the standard deviation of the resulting data series in line with that of the original station data. I then ran RegEM on the result, repeating the experiment 20 times as the result depends on what particular sets of random data are generated, which varies each time. The results are revealing.

The average resulting trend in the (spliced) RegEM satellite reconstruction output was 0.1175 C/decade, amazingly close to the trend of 0.1173 using the original surface station data (itself virtually identical to Steig’s result). Such a close match had to be due to chance, as the standard deviation of the set of 20 trends was high at 0.126. With only very weak correlations existing in the data, RegEM has more difficulty converging and the final result is much affected by the pattern of very small correlations that exist between the random surface station data series and the three satellite PC series. Those correlations of course change from one set of random numbers to the next. The high frequency random fluctuations constitute almost all the variance of each surface station series, with the linear trend contributing only a tiny fraction of it, so it is not surprising that the resulting reconstruction trend is strongly affected by the particular set of random numbers used.

In case the average 0.1175 C/decade trend of the first 20 runs was completely unrepresentative, I ran RegEM another 120 times on more sets of random data. The average spliced reconstruction trend for all 140 runs was 0.136, with a standard error of 0.014. An average reconstruction trend of 0.136 is broadly similar to, although somewhat higher than, the trend of 0.117 computed with the actual surface station data. So the answer appears to be that the average reconstruction trend could easily be generated by replacing the surface station data with random data having the same trends as the actual data. I am not suggesting, of course, that the actual surface station data did not in fact exhibit genuine non-trend based correlations between surface stations and with the satellite data, but many of the correlations are weak and it seems likely that the sensitivity of RegEM to unrelated trends has given an upwards bias to Steig’s results.

#### Temporary detrending as an approach to overcoming RegEM’s excessive sensitivity to trends

I have shown that RegEM is very sensitive to the presence of trends in data even when that data is uncorrelated with the satellite PC series from which the reconstruction is generated. How could one avoid this sensitivity from distorting the reconstruction result? For it to make sense to use one data series to help impute values for another data series with missing values, in a quite different period from that in which both series have actual values, there must be at least one common factor that affects both series, to a greater or lesser (possibly even negative) extent, in a similar fashion over an extended period of time. Fluctuations in that factor will give rise to correlations between the HF (high frequency) “wiggles” in the two data time series. Assuming the common factor also induces trends in the two data series, that will additionally give rise to a low frequency (LF) correlation, but unless the HF correlation is extremely weak, or the trend over the time period for which data exists is large relative to the HF fluctuations, the total correlation will arise very largely from the HF correlation. There will be other factors that are not common to both series, some of which will cause essentially random fluctuations and others of which may impart a trend to the affected series (or add to or subtract from any trend arising from common factors). Trends imparted by factors that are not common to two data series have no predictive value when using data from one series to impute missing values in the other series. Therefore, one should seek to exclude from the measure of the correlation between two stations that due to any trend, except to the extent that it arises from common factors.

To exclude such correlations arising from unrelated trends, one could seek to work out from the data series’ HF correlations how much of the trend in each data series was due to common factors that also affected the other data series. However, it would not be easy to do so. Some form of high pass filtering could be used to smooth all the data series before working out their correlation matrix; that would perhaps be the best approach. However, I decided to use the very simple approach of completely detrending all the data series when deriving their correlation matrix. Where there is a genuine correlation between two data series, detrending will not greatly reduce the measured correlations, as the correlations between the high frequency wiggles in the data will dominate that arising from the common trend.

Once the final estimated correlation matrix has been iteratively generated within RegEM, the TTLS imputation B matrices derived from the correlation matrix are applied to the *original data*, with its trends restored, so as to produce the final reconstruction. So the only difference in the resulting reconstruction from before will be due to the *correlations* between the various data series having been calculated ignoring trends in the actual data contained in those series. Trends in data series will still be imputed through to missing data points of other series insofar as the correlations remain – as they very largely will where there is a genuine common factor relationship between two data series. In this way, detrending is used to achieve better estimates of underlying relationships between the data series, less distorted by unrelated trends, but rather than the trends being free parameters they are restored in the final imputation, which uses the original data series as its input. In order to explore this method I built a switch into my modified version of RegEM which, when selected, detrends all the data series before iteratively calculating the correlation matrix, and then restores the trends for the final imputation.

I do not suggest that linear detrending of all the data series when deriving the correlation matrix is always a good idea. When the data contains highly non-linear “trends”, it is not appropriate (although use of high pass filtering might be). From looking at toy data sets, I have found that in such cases the correlation matrix derived from detrended data is substantially different from that using the original data, and appears to be a less good representation of the underlying relationships. In this case, I found that detrending all 42 surface station and 3 satellite PC series changes almost all the correlations by very modest amounts. In the case of only two pairs of high trend surface stations (Great Wall – Rothera and Arturo_Pratt – Bellingshausen) did their squared correlation change by more than 0.025. No squared correlation between any surface station and any of the satellite PCs changes by as much as 0.01. So I tentatively concluded that detrending the data series would be appropriate in this case.

I carried out certain tests in order to verify whether or not using my detrending method improved the fit of the reconstructed values, particularly as regards the satellite PCs that are actually used for the spatial reconstruction and to derive the average reconstruction temperature trend. A straightforward test is to take the full original (trended) and the detrended RegEM reconstructions and to compare their fits between actual and fitted (unspliced solution) values, for all points where actual data exists, over the entire period. The fit can also be computed just for the satellite data, necessarily just over the 1982-2006 period The goodness of fit can reasonably be measured by rescaling all data series to unit standard deviation (as RegEM itself does) and summing the squares of the differences of actual and imputed values. The resulting sums of squares over respectively all, and just the satellite PC data series, were 7322.4 and 471.5 using the detrending method. The corresponding sums using the trended method were both higher, at 7392.7 and 474.1. So the detrending method gives improved results on this simple measure.

I carried out a similar test using a split calibration / verification periods approach, concentrating on the satellite PCs data series as it is their imputed values that determine the reconstruction. I omitted the first half of the satellite data from the RegEM input and compared the resulting imputed satellite PC values for the omitted period with the actual values. Then I repeated the exercise but omitting the second half of the satellite data. The resulting sums of squares of differences in the rescaled satellite PC data series for the two sub-periods were 381.4 and 372.4 using the detrending method. Again, the corresponding sums using the original, trended, method were both higher, at 384.0 and 377.0. The detrending method again gave improved results. The RE statistic for the fit of the imputed vs actual 3 PC reconstruction of the satellite data was also slightly higher in both cases using the detrending method.

As demonstrated above, Steig’s RegEM reconstruction exhibits a high sensitivity to unconnected trends in input data. This is particularly concerning in view of the high temperature trend in the satellite data, which does not appear to be in line with the surface station data. In the light of these concerns about trends, and given the above test results, I think that there are good grounds for believing that Steig’s main AVHRR reconstruction would be more accurate if the input data series were detrended for the purposes of RegEM iteratively deriving the correlation matrix, prior to using that matrix to calculate the B matrices used to impute, from the non-detrended actual data, the missing data. Using detrending only for the purposes of determining the correlation matrix avoids the dangers pointed to by von Storch *et al.*, whilst allowing trends in the satellite data to influence the final result (but not to completely determine it over 1982-2006).

The effect on the average reconstruction trend of detrending the data series for the purpose of deriving the correlation matrix is substantial. Without detrending, using RegEM on the satellite and surface station data archived respectively on Steig’s and the Climate Audit website gives an average 1957-2006 temperature anomaly trend for Antarctica of 0.1173 degrees C /decade. That is closely in line with Steig’s result of 0.1178; the small difference is probably because of changes in the surface station data between when it was downloaded by Steig and by Steve McIntyre. Using detrended data (all 45 series) to calculate the correlation matrix, the spliced trend is 0.0684. That is 0.0489 C/decade, or 42%, lower than in the non-detrended case. The unspliced trend is also substantially lower, at 0.0653 compared with 0.1007. (The error bands for all the trend estimates are clearly wide, but RegEM’s results are quite precise so I state calculated trends to 4 decimal places here.) The spatial distribution of the trends resulting from using detrended data to derive the correlation matrix is little changed from the results in Steig’s case.

The trend that I calculate of 0.0684 C /decade using detrended data to compute the correlation matrix should not be confused with the trend of 0.08 that Steig reports using detrended satellite data (I calculate 0.0785 using detrended satellite data, or 0.0864 on an unspliced basis). His result involves detrending only the satellite data but then uses that detrended data in the final calculation of the imputed (reconstructed) data as well as in computing the correlation matrix, whereas mine detrends all data series but uses the original, non-detrended, data in the final calculation of the imputed reconstruction. Using my method but with satellite data that had already been detrended (and whose trend therefore cannot be restored for use in the final imputation step) substantially further reduces the average (spliced) reconstruction trend, to 0.0378. The unspliced (RegEM solution) trend, which is mainly determined by the surface station data, is little changed at 0.0599.

Not all the 0.0684 C /decade trend that I compute using the detrending-for-correlations method can necessarily be regarded as reflecting temperature trends in the underlying data. For reasons that I do not yet fully understand, RegEM as applied to the 42 surface station and 3 satellite PCs data set produces a small positive 1957-2006 reconstruction trend (some 0.0054 spliced, 0.0041 unspliced) even if all the input data is detrended.

The impacts on the average reconstruction trend of my method of using detrended data within RegEM only whilst iteratively computing the correlation matrix (“for correlations only”), and of detrending the satellite PCs fully (prior to input to RegEM) with or without the surface station data being detrended for correlations, and of fully detrending all data, are summarized in Figure 2 below.

The key surface station(s)

Finally, I to turn to the question of what individual surface station has the most impact on Steig’s reconstruction – the subject of the puzzle set at the start of this article. The answer is a tie between McMurdo and Scott_Base, two Ross ice shelf stations (classified as East Antarctica by Steig). The 1957-2006 temperature trends at these two stations are above average but not particularly high, at 0.200 and 0.179 respectively. However, removal of either of these surface stations has a powerful, indeed destabilizing, effect: it reduces the spliced trend by 37%, from 0.1173 to 0.0743, and the unspliced trend by over 46%, from 0.1007 to 0.0537 (McMurdo) or 0.0540 (Scott_Base). Omitting both McMurdo and Scott_Base from RegEM further reduces the spliced trend to 0.0591 (50% below its original level), and the unspliced trend to 0.0381 (62% below its original level).

I recently carried out an investigation of the contribution of both individual surface stations and areas of Antarctica to the average reconstruction trend, both as reported by Steig on a spliced basis and as to the unspliced RegEM solution, by means of detrending station data series prior to input to RegEM. Detrending McMurdo and Scott_Base had relatively large impacts on the average reconstruction trend, at respectively 0.0151 / 0.0167 on the spliced trend and 0.0125 / 0.0138 on the unspliced trend. What I did not then realize was that although in most cases the effects of detrending a surface station data and removing it entirely from RegEM are similar, there are a few stations for which their removal has a much greater effect than detrending their data. These are McMurdo and Scott_Base, and to a considerably smaller extent Amundsen_Scott. Going the other way, taking Faraday out has a much smaller effect than detrending its data.

The spatial trend maps from the previous reconstructions are very similar to each other except for being generally lighter or darker, so I haven’t included them, but I think it is worth seeing what removing McMurdo from the reconstruction does to the spatial distribution, as well as the magnitude, of the trends. The lower maps (courtesy of Ryan O’s map plotting functions) in Figure 3 below, with McMurdo removed, show a quite different picture from the base Steigian reconstruction shown in the top maps. The LH maps show spliced 1957-2006 trends and RH maps unspliced trends. As can be seen, particularly in the unspliced result, removing McMurdo radically changes the spatial distribution of the reconstruction temperature trends as well as their magnitude.

Figure 3

Why does removing McMurdo and Scott_Base have such a major effect? Investigation of the v matrix of the svd (singular value decomposition) of the final RegEM correlation matrix C reveals that removing either McMurdo or Scott_Base has a drastic effect on the weights of satellite PC3 in the three eigenvectors retained in the TTLS calculations – indeed two of the three PC3 weights change signs. It also changes substantially the weights of various of the surface stations in the eigenvectors, which reduces greatly the trend conveyed by satellite PC1. It appears that McMurdo and Scott_Base act, *inter alia,* as some sort of glue holding satellite PC3 linked to the rest of the surface stations. Why that should be so is as yet unclear to me. Although McMurdo and Scott_Base have fairly strong correlations with all three satellite PCs, so do several other stations whose removal does not affect the average trend (nor presumably svd(C)$v) by very much. Even stranger, I found that doubling the weight given to Scott_Base in RegEM substantially reduces the average reconstruction trend, rather than increasing it as one would expect. Whatever the reasons, the drastic effects of removing either or both of McMurdo and Scott_Base appears to cast further doubt on the accuracy and reliability of Steig’s reconstruction and to indicate that in this case the RegEM method, at least as applied by Steig, is lacking in stability.

Steig’s choice of just 3 PCs for representing the 5509 grid cell AVHRR satellite data has been criticized, rightly in my view. It is in any event clear that, having made that choice, Steig’s decision also to set the RegEM regularization parameter (regpar) to 3, corresponding to the TTLS solution having 3 PCs, resulted in an ill-specified model that was unable to represent the 3 AVHRR PCs properly. Steig stated that principal component analysis of the surface station data “produces results similar to those of the satellite data analysis”, and that setting regpar=3 meant that variance in the peninsula was not fully captured. That does not seem to accord with the facts. Over one third of all the surface stations are located in the Antarctic peninsula (nearly one half if one includes the island stations), inevitably resulting in one of the TTLS solution PCs (in fact, PC1) being primarily compiled from peninsula surface station data. But the peninsula constitutes only a few percent of the 5509 AVHRR grid cells, too small a proportion to be represented to more than a modest extent in the 3 AVHRR satellite PCs available. The result is that only 2 of the TTLS solution PCs are fully available to represent the 3 AVHRR PCs, which they cannot do satisfactorily. Thus, far from the imputed satellite PCs 2 and 3 being orthogonal and so uncorrelated (as the post 1981 actual satellite PCs are), they have a correlation of over 0.9. Further, Steig’s model is unsatisfactory in that it lacks stability – as shown by the radical effects on its parameters of removing McMurdo or Scott_Base stations.

Both these defects can be rectified by setting regpar to 4, although the need for detrending in calculating correlations, and the shortcoming of only having 3 AVHRR PCs, remain. With regpar=4, 3 TTLS solution PCs are left to fit to the rest of the data after the peninsula station related correlations have been dealt with, allowing a reasonable, orthogonal, fit to be achieved for the 3 AVHRR PCs. The correlation between imputed AVHRR PCs 2 and 3 falls to below 0.01, whilst the 1982-2006 correlation between the imputed and actual AVHRR PCs 2 and 3 increase substantially (from 0.16 to 0.58 for PC3). Model stability is also much improved, with the removal of McMurdo or Scott_Base now having a relatively minor effect. The fitted temperature trends also fall by 0.10 C/decade on a non-detrended basis, to 0.1070 (spliced) and 0.0906 (unspliced), and by over 0.005 to 0.0630 (spliced) and 0.0591 (unspliced) detrending the data to determine the correlation matrix. Changing regpar from 3 to 4 results in the reconstruction displaying significantly lower warming from longitudes 30W to 30E, but somewhat more warming between longitudes 60E and 150E. Maps comparing the spatial distribution of temperature trends using regpar=3 (LH map) and regpar=4 (RH map), with data detrended when computing the correlation matrix, are shown in Figure 4 below.

Figure 4

#### Conclusion

For the various reasons set out above, I consider the unspliced regpar=4 result using detrended data to compute the correlation matrix, with its average trend of 0.0653 – 0.052 C/decade or 44% lower than Steig’s main (spliced) result – to be a considerably better representation of trends in Antarctic near surface temperatures over 1957-2006, based on 3 AVHRR PCs, than Steig’s. Both results are constrained by the use of only 3 AVHRR PCs, although that is probably of more significance to the spatial distribution of temperature trends than to the overall average trend, and by the use of unweighted data series.

Nic L

July 14, 2009

UPDATE: ————-

—————

—————

Link to original word document with formatting and code used to generate this post.

The effect of surface station trends on Steig’s reconstruction5

## Layman Lurker said

Nic L, you have delved into the area that I have been curious about for several months. Thanks for this effort.

Interesting that you mention the effect that McMurdo and Scott have on Steig’s reconstruction and specifically on PC3. Jeff used a comparison of Scott and Arturo as an example of two stations with about the same linear trend but with a negative covariance relationship: https://noconsensus.files.wordpress.com/2009/04/covariance-plot.jpg

Given the modes of the PC’s, it seems understandable that negatively correlated stations should play into the reconstructed trend and that the method is sensitive to removing any stations with negative correlation to the peninsula stations.

With respect to the impact on PC3, this is the PC which is the obvious un-natural artifact in Steig’s analysis as shown here: https://noconsensus.files.wordpress.com/2009/02/three-main-pcs2.jpg

Could you plot the PC3 with removal of Scott and or McMurdo out of curiosity?

## Nic L said

Layman Lurker,

Good question about the impact on PC3. I have plotted satellite PC3 with Scott_Base removed, but I can’t see how to insert it in this reply. In fact, PC3 looks much the same with Scott_Base removed as with it included, except for the sign reversing, with the amplitude of variations in the imputed values (pre 1982 only in the spliced solution) being much lower than the variations in PC3 based on the actual post 1982 data. However, if regpar is set to 4 then the imputed PC3 variations have a broadly similar amplitude to variations in the post 1982 actual PC3. I think that this is further evidence that Steig’s choice of regpar=3 was an unsuitable one.

## Jeff Id said

Nic, if you email it, I’ll put it in the post above.

## Ryan O said

Nic and Jeff,

I’ll email this to Jeff tomorrow, but I was rewriting the imputation algorithms to combine them into one neat package and I found a mistake in Steve M’s PTTLS script. It’s a pretty significant one. Steve’s version doesn’t update the covariance matrix with the estimated residuals covariance matrix. That’s why Jeff and I got slightly different answers way back when we were comparing Matlab TTLS to R TTLS. While it gives the same answers at low regpars, the effects when you remove stations or detrend stations may be different.

## Nic L said

#3 Jeff

Thanks. I have emailed a jpg of satellite PC3 (spliced) to you, showing 3 graphs. The top one is the base (Steig) case, the middle one as before but with Scott_Base removed,and the bottom one includes Scott_BAse but has regpar changed from 3 to 4.

## Jeff Id said

#4 I’m glad you found it because I still don’t see it.

## Nic L said

#4 Ryan

You are absolutely right, Steve’s version doesn’t appear to implement the “C = (X’*X + CovRes)/dofC” line in the Matlab version correctly. Good spot. Very odd. I am not sure how much difference that makes, but CovRes doesn’t become very small as RegEM converges so it could be a significant difference.

I don’t think that Steve’s version implements the “% add mean to centered data matrix

X = X + repmat(M, n, 1)” line at the end of the Matlab version, to restore the mean of the original X matrix, either. Nor does it implement the inflation factor, OPTIONS.inflation, although that is by default set to unity and no recommendation is given as to any higher figure to set it to.

Could I ask you to copy your rewrite of the RegEM algorithm to me at erans99-3 #at# yahoo.co.uk when you send it to Jeff, please? I plan to update my modified version of the RegEM script for Steve’s omission, and I want to ensure that it is fully compatible with your rewrite.

## Kenneth Fritsch said

Nic L, thanks for the efforts in doing this post and explaining well what it is you are attempting to show. I hope to me able to ask an intelligent question (or at least not stupid question) or two after I have digested your post.

I need to ask you to expound on why one would not detrend the data to compuite the correlation matrix.

## Nic L said

#8 Kenneth

None of the academic work that I have looked at discusses detrending just for the purposes of calculating the correlation matrix, so there may be some reason not to do so that I am unaware of. Mann, Wahl, Ammann et all argue strongly, with examples, that detrending the data before using RegEM substantially reduces the skill/ accuracy of the resulting reconstruction. However, they do not address the possibility of detrending the data only for the purposes of deriving the correlation matrix, so I do not think that their criticisms (even if valid, which is debated by others) are relevant here.

It is certainly possible to find data for which detrending when calculating the correlation matrix worsens the reconstruction. A trivial example with two data series is the following. Series 1 consists of a HF sine or triangular wave that trends down in the first half of the data series and trends equally up in the second half. Series 2 consists of the first half of series 1, perhaps with a bit of noise added, but has no data in the second half. Without detrending, RegEM will have no difficulty recognising that the two series are essentially perfectly correlated in the first half, and will impute the missing values of series 2 in the second half as equal to the actual values of series 1, which makes sense. If both series are detrended, series 1 is unaffected – a V shaped series has no linear trend over the whole period – but series 2 (which only has data for the downwards slope of the V) has its downward trend removed. It will then correlate much less well with series 1 than before and its reconstruction, reflecting that, will be a much less good copy of series 1. In theory, this could be avoided by detrending series 1 only over the period when it had common data with series 2, but that approach would become awkward when dealing with multiple data series whose periods of missing data differed.

In practice, it may be that data series whose trends reverse during the course of the reconstruction will not be encountered very often, but they could be. That is why I thought some form of high pass filtering might be better than completely detrending.

Someone who is stronger on statistics than I am may be able to comment further, which would be very welcome.

## Ryan O said

#7 I had previously added the centering step. The version that Jeff should be sending you has the inflate parameter added and fixes CovRes.

## Ryan O said

One other thing. All of these SVD-based methods have problems with sign flipping. The solution is not always stable when you jump from n EOFs retained to n+1. The iterative methods produce a more stable solution because they establish the sign of the correlation early and use that estimate as a starting point when n is increased. In the version of RegEM Jeff sent you, there is an option to set RegEM to iterative. I would be interested in seeing if the sign flip in PC3 still occurs when you remove Scott Base and McMurdo in an iterative imputation.

## Jeff Id said

Ryan,

I understand your point about the iteration picking correlation early on. The whole process is amazingly similar to how we calculate optics at my company. We don’t use PCA but the strength of the weightings can inadvertantly determine a lot about the final solution in an undefined system and it happens early on in the iteration. We spent a lot of time stabilizing the iterations.

I’m curious though why you are interested in a sign flip in the PC? I read it as almost a random event which is corrected by an inverse pc weighting (different from a station weight). My current understanding of a sign flip in a PC is that it is a non-event so perhaps I’m missing something. . . .again 🙂

## Ryan O said

The sign flip of the PC doesn’t invert the eigenvector. The reconstructed temperature contribution from that EOF is simply the PC * eigenvector. So if the PC sign flips – and the eigenvector doesn’t (remember, we keep the eigenvectors static and simply multiply them through after the PCs are infilled) – then it inverts the entire temperature contribution from that PC.

You could prevent this by imposing an external condition that watches for the sign flips and, when they happen, inverts the eigenvector, but unless you do this, the sign flip drastically changes reconstructed temperatures. Hence the “Which Antarctica Do You Want” panel.

## Jeff Id said

Are you certain about the PC flip without the weighting flip?

The choose which Antarctica you want post looked like over fitting rather than pc flipping to me. Perhaps due to the problem you recently found with the SteveM algorithm. I’m wrong often but in the case where I saw a pc flip in the AWS recons early on, it made little difference.

## Jeff Id said

Actually overfitting may be the wrong word from what I was thinking at the time. At the time I would have called it a bad convergence or unconstrained convergence but thats because of my own endless experience with iterative algorithms outside of academia.

## Ryan O said

#14 Yes. The temperature is recovered by simply multiplying the PC by the spatial eigenvector. The sign of the spatial eigenvector is static. So if the PC flips, the temperature contribution flips.

The reconstructed temperatures are computed directly from: U S V*. The PCs are simply (S V*)*. So if the sign of S V* flips, there’s no math anywhere to tell the associated column in U to flip.

In the case of the AWS recons, remember that you do the reconstruction explicitly from station information, and THEN calculate SVD of the resulting reconstruction. In that case, a sign flip in a PC would be accompanied by a sign flip in the U eigenvector. For the satellite reconstructions, though, U is determined

beforethe infilling, so any sign flips during infilling are directly carried through.If you want to prove it to yourself, after doing a reconstruction, go back in and change the sign of one of the PCs and recalculate U S V*. The temperature contribution from that PC will indeed be inverted.

## Jeff Id said

I’m going to have to go back but in the case where I saw a PC flip the eigenvector flipped as well.

Certainly if the eigenvector doesn’t flip, there’s a problem but my understanding is that PCA prevents that as it would have a completely different meaning.

Calculating USV would certainly reveal a difference if the eigenvector didn’t flip. I haven’t hand caluclated a PCA so the methodology isn’t perfectly clear. My point is that I’m talking about something more subtle than whether the multiplication of non-flipped signs would matter.

PC 2 and up are a difference between datasets. If the difference is flipped along with the sign of he difference, there is not much change. I don’t understand why PC2 picks one half of autocorrelated antarctic as positive and the other as negative.

## Ryan O said

I think it will be clearer if I do this in R-speak. For the reconstruction, the first thing we do is take the SVD of the AVHRR data:

svd(avhrr) = $u, $d, $v

We then take the first N EOFs and make PCs:

PCs = t(diag($d) %*% $v)

Now we take those PCs and infill back to 1957. To make the reconstruction, we multiply the left-hand eigenvectors through:

recon = $u %*% t(PCs)

Nothing in the reconstruction touched anything in $u. If we are using the spliced PCs like Steig, nothing touched the 1982-2006 period, either. So we can’t just change the sign on $u – otherwise we will invert the 1982-2006 period. That means if the 1957-1982 portion of the PC flips, the temperature contribution flips as well. That was what was happening in the “Which Antarctica” plots.

The AWS recons are different. For the AWS recons, first we impute. Then we have a completed station matrix X. So to get the PCs, we take the SVD of X:

SVD(X) = $u, $d, $v

PCs = t(diag($d) %*% $v)

But now we don’t have to impute. And if we multiply the left hand eigenvector through, we simply get back to X:

X = $u %*% t(PCs)

In this case, if the sign of $v flips, there will be a corresponding flip in $u as well. The reason is that we don’t touch the data after the SVD. All of the manipulation of the data is prior to the SVD. While it is true that small differences from recon to recon will cause the sign of $v to change, there is always a corresponding change in $u.

However, when you manipulate the PC

aftercalculating $u – like we do in the satellite recon – sign changes in the PC willnotresult in sign changes in $u. And if you do change the sign of $u, you absolutely cannot use the spliced solution because you will invert the spliced part.## Jeff Id said

I completely understand what you’re saying with regards to the spliced version – sorry about that. IMO, the spliced version makes less sense than institutional health care so I missed it.

The whole spliced concept doesn’t work because the equation c1* t1 = c2 * t2 …..= T doesn’t have the independent slope modifying factor which is an absolute requirement IMO to make it a reasonable possibility.

## Nic L said

#11 Ryan

You wrote “I would be interested in seeing if the sign flip in PC3 still occurs when you remove Scott Base and McMurdo in an iterative imputation.”

Now that I have your iterative script with the CovRes problem fixed I am able to answer this question (using TSVD; I don’t think iterative applies for TTLS).

It makes no difference to the PC3 sign changing whether iterative or non-iterative is selected. Note that it was not the sign of AVHRR PC3 itself that I said was flipping, rather the sign of its weights (values) in 2 of the 3 retained eigenvectors in the TTLS calculations. Most other data series did not see their signs changing, so it is not a case of two flips cancelling out. But the sign of AVHRR PC3 does not necessarily change, as it is determined by the values in all 3 retained eigenvectors. I think the problem is that the reconstruction solution is pretty unstable when only 3 eigenvectors are retained (and 3 AVHRR PCs are used), and removing McMurdo and/or Scott Base tips the solution over the edge – the reconstruction in effect breaks down.

## Ryan O said

Ah. I misunderstood what you meant. I would suspect that is more due to a reordering of the TTLS PCs. With Scott Base and McMurdo removed, other stations get represented in the low-order EOFs and the Ross Ice Shelf stations become under-represented. That’s probably why when you go to regpar=4 the solution doesn’t depend on them as much.

By the way, in the new version of the algorithm you have, iterative works for TTLS as well. So does eigenvector weighting. The TTLS part now has all the same options as the TSVD.

## Ryan O said

Our RegEM TTLS is now TTLS on steroids. Or crack. Not sure which. Haha. 🙂

BTW . . . if you get a moment, Jeff, would you be able to compare the Matlab version with the new R version? The options to set to make sure it runs the same as Matlab are:

ttls=TRUE, covr=FALSE, iter=FALSE, cntr=2

## Jeff Id said

I will give it a shot. I’m struggling with area weighting a bit. First attempt gave a bunch of junk.

## Ryan O said

You should have seen my first attempt at doing an EM for imputing using satellite covariance.

## Jeff Id said

You have too much time on your hands. ha ha

I did something which looks like area weighting but now I’ve noticed my 28 pc recon normal style doesn’t match your plots. My first impression of the result which is completely unverified and VERY subject to revision – not much difference. Wouldn’t that be cool!

## Nic L said

#21 Ryan,

I agree that removing McMurdo and/or Scott_Base changes how strongly other stations are represented in TTLS PCs/EOFs 1-3. Judging by station weights in svd(C)$v, the main changes are in TTLS PCs 2 & 3. However, I think the exact reasons for the effect of removing McMurdo and/or Scott_Base are not that straightforward to understand. Removing both stations reduces the slope of TTLS PC2(EOF2) by about 25%. That is what, mainly via its effect on AVHRR PC1 (which carries much the highest weight) has the most effect towards reducing the average reconstruction trend (the unspliced trend with RegEM fixed for the CovRes problem goes from 0.0936 to 0.0409 on removing both stations). Their removal also hugely changes the weights of AVHRR PC3 on all the TTLS PCs (reversing the sign of two of them), but that only reduces the average reconstruction trend by about 0.011. McMurdo and Scott_Base seem to be the only long record stations with a significant -ve correlation with AVHRR PCs 1&2 and a significantly +ve correlation with AVHRR PC3 (on actual data), which I suspect is, in conjunction with their high 1957-1981 temperature trend, is the key to their peculiarly powerful effect on the reconstruction. Why increasing their weights also reduces the average reconstruction trend I do find difficult to understand. I have found that changing the weights of some other stations by relatively modest amounts also has extremely non-linear effects on the average reconstruction trend.

## Nic L said

#7

I have now updated my script to correct the CovRes error in Steve M’s RegEM script spotted by Ryan O and rerun everything. None of the results changes materially, other than that in all cases the trends now come out a bit lower, and the effect of removing McMurdo and Scott_Base is very slightly less dramatic than before. All the conclusions stand. I am emailing Jeff a copy of the revised note and asking him to add a link to the Word document.

## Ryan O said

#27 Excellent!

## Nic L said

One thing that surprises me about the corrected base case reconstruction is how much lower the spliced trend is (at 0.111 C/decade, the same result as using Ryan’s corrected reconstruction script) than Steig’s result of 0.118. I have now run the corrected regem using the actual PCA’d satellite data that Steig seems to have used, as derived from prcomp on his archived reconstruction from 1982-2006, along with Steve M’s archived surface station data. The result is a spliced trend of 0.120, close to Steig’s result. Changes in surface station data between Steig downloading it and Steve M archiving it could account for the remaining difference.

What this appears to show is that the cloud masked AVHRR satellite data as archived by Steig is significantly different from the satellite data that he actually used in his main satellite based reconstruction.

## Jeff Id said

#29,

I don’t believe they are the same dataset and was suspicious since the reconstruction PC’s don’t match the sat PC’s very well. Don’t forget, the data took a long time to release after the paper was published.

## Layman Lurker said

Nic, Jeff….Is it reasonable to speculate that the processed data vis a vis cloud masking (which dominates the AVHRR data as used by Steig) could degrade the true HF correlations? If so this would seem to have obvious implications for Steig’s results given the sensitivity shown by Nic.