the Air Vent

Roman M – PCA Deconstruction – an unauthorized biography.

Roman M. did an interesting analysis over at CA today, deconstructing the reconstructed temperatures in the Antarctic paper titled “Warming of the Antarctic ice-sheet surface since the 1957 International Geophysical Year”:

Deconstructing the Steig AWS Reconstruction

The paper demonstrates extraordinary warming in the Antarctic. The entire exercise was brought about because Dr. Eric Steig has not released his code despite his unreasonable claims. Roman’s post back-calculated the PCs used in the Automatic Weather Station (AWS) version of the reconstructed temperature. What we need to understand is that weather stations in the Antarctic are frequently buried and cease working.

Here is a graph of the available temperature series when they were actually reporting data. This is a plot of the peninsula stations but it is similar to the automatic stations in data availability.

The PCA analysis by Roman allowed him to determine the shape of the interpolated curves which were used to infill the gaps in the data to give continuous series for their final result.

Step 1.

We begin by using a principal component analysis (and good old R) on the truncated sequences. For our analysis, we will need three variables previously defined by Steve Mc.: Data, Info, and recon_aws. In the scripts, I also include some optional plots which are not run automatically. Simply remove the # sign to run them.

There are 63 eigenvalues in the PCA. The fourth largest one is virtually zero. This makes it very clear that the reconstructed values are a simple linear combination of only three sequences, presumably calculated by the RegEM machine. The sequences are not unique (which does not matter for our purposes).

PCA = Principal Components Analysis, a method for determining the primary trends which make up a set of curves.

Roman looked at the reconstructed curves (part of the data we are allowed to see from Dr. Steig), starting in the period 1957 – 1979, where there was no actual temperature data. This data was never measured and is actually imputed or interpolated from other information. It isn’t real, but may be related to real data in a reasonable fashion.

He found that there were only 3 actual curves representing Antarctica in the automatic weather station reconstruction. Although many were surprised, we shouldn’t have been. The paper actually states that’s what should be expected. My suspicion is that Roman read this from the beginning.

Principal component analysis of the weather station data produces results similar to those of the satellite data analysis, yielding three separable principal components. We therefore used the RegEM algorithm with a cut-off parameter k=3. A disadvantage of excluding higher-order terms (k>3) is that this fails to fully capture the variance in the Antarctic Peninsula region. We accept this tradeoff because the Peninsula is already the best-observed region of the Antarctic.

Well, his analysis revealed exactly what the paper said.

SM: this gives coefficients for all 63 reconstructions in terms of the 3 PCs

Three curves, each multiplied by a constant specific to the series and then added together, create ALL of the trends in the 63 series.
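For anyone who wants to convince themselves of this, here is a minimal sketch on made-up data (none of it is from the Steig reconstruction). It builds 63 synthetic series from three hypothetical base curves and counts how many independent directions the resulting matrix has. I use plain Gaussian elimination for the rank rather than an actual PCA; the number of nonzero singular values of a matrix equals its rank, so this is a stand-in for counting the non-zero eigenvalues. The curve shapes, the random coefficients, and the function name `matrix_rank` are all assumptions for illustration.

```python
import math
import random

def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix via Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]          # work on a copy
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue                     # nothing independent left in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank

random.seed(0)
n_months = 276  # 1957 through 1979, monthly

# Three hypothetical base curves (NOT the real PCs): a trend and two cycles.
base_curves = [
    [t / n_months for t in range(n_months)],
    [math.sin(2 * math.pi * t / 12) for t in range(n_months)],
    [math.cos(2 * math.pi * t / 12) for t in range(n_months)],
]

# 63 synthetic series, each a weighted sum of the three base curves.
series = []
for _ in range(63):
    coeffs = [random.uniform(-1, 1) for _ in range(3)]
    series.append([
        sum(c * curve[t] for c, curve in zip(coeffs, base_curves))
        for t in range(n_months)
    ])

print(matrix_rank(series))
```

However many series you stack up this way, the rank cannot exceed the number of base curves, which is the same signature Roman saw in the eigenvalues.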

Roman then continued the analysis, since the first part only covered the period where there was no data at all.

We will assume that exactly three “PCs” were used with the same combining coefficients as in the early period. The solution is then to find intervals in 1980 to 2006 time range where we have three (or more) sites which do not have any actual measurements during that particular interval.

Again he does something pretty sharp. His first analysis revealed that 3 trends make up the total, but he doesn’t yet have any of the artificial data trend outside of his year range 1957 – 1979, so he looks for at least 3 series which have no actual data in some interval of the remaining years. This is basic algebra, 3 equations and 3 unknowns, applied at a complex level for us mere humans.
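The 3 equations and 3 unknowns step can be sketched like this. It is only an illustration under my own assumptions: the coefficients and PC values are invented numbers, and `solve_3x3` is a hypothetical helper, not anything from Roman’s scripts. For one month, each of three fully infilled sites gives one equation, reconstructed value = C1*PC1 + C2*PC2 + C3*PC3, with the coefficients already known from the early period, so the three PC values for that month can be solved for.

```python
def solve_3x3(coeffs, values):
    """Solve coeffs * pcs = values for pcs (Gauss-Jordan with partial pivoting)."""
    n = 3
    # Augmented matrix: each row is [C1, C2, C3 | reconstructed value].
    aug = [list(coeffs[i]) + [values[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("the three sites do not give independent equations")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]

# Hypothetical coefficients for three sites (one row per site), as they
# would be known from the early-period analysis. Invented numbers.
site_coeffs = [
    [0.8, -0.3, 0.1],
    [0.5, 0.4, -0.2],
    [0.2, 0.1, 0.6],
]

# Invented "true" PC values for a single month, used to fake the
# reconstructed values we would read off the published series.
true_pcs = [1.5, -0.7, 0.25]
recon_values = [sum(c * p for c, p in zip(row, true_pcs)) for row in site_coeffs]

# Recover the PC values for that month from the three infilled sites.
recovered_pcs = solve_3x3(site_coeffs, recon_values)
print(recovered_pcs)
```

Doing this month by month over an interval where three sites have no measurements would trace out the PC curves for that interval, assuming the same coefficients still apply.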

His PCA analysis again reveals 3 series of data which, when multiplied by their coefficients and added together, match the artificial data perfectly. What’s more, the same multipliers for each series were found. The first analysis revealed a multiplier for the 1957 – 1979 range in all 63 series, and the second analysis, on a different range of artificial data, revealed the same multipliers for 3 series. These 3 series gave him the shapes of the three curves from 1980 – 1995. Repeating the analysis again gave him the shape of the PCs for the years 1996 – 2006.

He then verified all of the curves against those of Steig et al. for the artificial portion of the data to about a millionth of a degree, so let’s say it matched pretty close.

Well, if you are like me, you ask: Jeff, show me the artificial data. I want to know what it looks like.

Don’t be too perturbed by the uptrend. It can just as easily be a downtrend in the reconstructed temperatures, depending on the sign of the multiplier, because each set of real data is infilled by these curves according to this:

GAP IN TREND = PCA 1 * C1 + PCA 2 * C2 + PCA 3 * C3
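To show how the sign of a multiplier flips the trend, here is a small sketch of the formula above. The three curves are toy stand-ins I made up (a rising line, a wiggle, and a flat line), not the recovered PCs, and `infill` and `slope` are hypothetical helper names.

```python
def infill(pcs, coeffs):
    """Infilled value at each time step: PC1*C1 + PC2*C2 + PC3*C3."""
    n = len(pcs[0])
    return [sum(pc[t] * c for pc, c in zip(pcs, coeffs)) for t in range(n)]

def slope(values):
    """Ordinary least-squares slope of values against their index."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

# Toy curves, invented for illustration only.
pc1 = [0.1 * t for t in range(10)]           # rising curve
pc2 = [0.5 * (-1) ** t for t in range(10)]   # alternating wiggle
pc3 = [0.0] * 10                             # flat

# Same curves, opposite sign on the PC1 multiplier.
gap_up = infill([pc1, pc2, pc3], [1.0, 0.2, 0.0])
gap_down = infill([pc1, pc2, pc3], [-1.0, 0.2, 0.0])

print(slope(gap_up) > 0, slope(gap_down) < 0)
```

The curves themselves never change; only the station’s multipliers decide whether the infilled gap trends up or down.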

Steve M then figured out this interesting tidbit.

The meaning of this graph is a little difficult for those who aren’t mathematically inclined. It means that the PC2 curve separates the two halves of the Antarctic. The red stuff gets a positive PC2 multiplier, so PC2 is added into the curve, while the blue stuff gets a negative PC2 multiplier. Only one trend separates out the halves of the Antarctic.

Conclusions for the automatic weather station reconstruction – only one of the group:

1. For those who don’t like math, the new assumed data has a trend up or down depending on the multiplier.

2. The PC2 trend separates the two halves of the Antarctic.

3. Three curves are used to define the entire trend of the Antarctic.

4. Roman is smart.

Three series of assumed data to reconstruct the Antarctic seems far too few in my opinion. The Antarctic is huge, how can three trends cut it? Well, I got a bit more skeptical of this paper this morning while reading through the SI. I’m trying to keep a level head about it. After all, I was told by a colleague today that if we don’t react now the ice cap will vaporize, the earth’s axis will tilt and Washington DC will flood. We wouldn’t want that, would we?

Maybe I’ll sleep in.