The Three PC’s of the Antarctic
Posted by Jeff Id on February 18, 2009
Tonight I was going to replicate the Antarctic reconstruction for the AWS temperature using the gridded data from Jeff C. My goal was to do the same for the satellite series and run a reconstruction using Jeff C’s regridded dataset.
I needed the PC's and weightings for the 5509 satellite series but ran into a small problem. Principal components analysis, at least the way princomp in R does it, can't work with more columns than rows. The satellite data consists of 5509 columns with 600 rows (months) each, as presented by Dr. Steig on his webpage. With that shape the PC problem is underdetermined: more unknowns than equations.
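To show the shape problem in isolation, here is a minimal sketch on made-up random numbers (not the satellite data). The matrix size and the variable names are my own assumptions for illustration. As far as I understand it, princomp stops when there are fewer rows than columns, while prcomp works from the SVD and accepts a wide matrix.

```r
# Synthetic stand-in for a wide dataset: 20 "months" x 50 "grid cells"
set.seed(1)
x <- matrix(rnorm(20 * 50), nrow = 20, ncol = 50)

# princomp() raises an error when rows (units) < columns (variables);
# capture the message instead of halting the script
res <- tryCatch(princomp(x), error = function(e) conditionMessage(e))
print(res)

# prcomp() uses the SVD of the centered data, so the wide matrix is accepted
wide_pc <- prcomp(x)
print(dim(wide_pc$x))   # scores: 20 rows, at most 20 components
```

I stayed with princomp on 600 column blocks in this post, so this is only a side note on why the full matrix would not go through in one pass.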
Well, I wanted to do my analysis of 3 PC's against Jeff C's surface station data, so I needed the 3 PC's. I ran the first 600 columns through PC analysis and found 3 reasonable looking PC's. I also needed the weightings for the rest of the series, so I ran all 5509 series through 600 at a time. I then checked to make sure that the PC's from one set of 600 were the same as another. . . . They weren't. I had 5509/600 = 9.18 sets of data, which meant 10 different PC runs. Each run had its own 3 curves!!
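Why would each block give different curves? A rough sketch on synthetic data, assuming the whole dataset is built from just 3 underlying curves with different weights per column. All names and sizes here are hypothetical, and I use prcomp to sidestep numerical complaints on exactly rank 3 data. The PC's of two blocks come out different, yet each block's PC's are linear combinations of the other's.

```r
set.seed(42)
n_months <- 60
n_cells  <- 200

# Three hypothetical underlying curves and random weights for each grid cell
curves  <- matrix(rnorm(n_months * 3), nrow = n_months)
weights <- matrix(rnorm(3 * n_cells), nrow = 3)
sat     <- curves %*% weights            # 60 x 200 matrix of rank 3

# PCA on two separate blocks of 50 columns, keeping 3 PC's from each
pc_a <- prcomp(sat[, 1:50])$x[, 1:3]
pc_b <- prcomp(sat[, 51:100])$x[, 1:3]

# The individual curves generally do not line up one to one between blocks
print(round(cor(pc_a, pc_b), 2))

# But every block B curve is a linear combination of the block A curves:
# regress B on A (no intercept needed, scores are centered) and check residuals
resid_b   <- lm.fit(pc_a, pc_b)$residuals
max_resid <- max(abs(resid_b))
print(max_resid)                         # numerically zero
```

So different curves per block is not by itself a sign of trouble; the blocks can still describe the same 3 dimensional space.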
After a few minutes I decided that the final set of PC's would be the PC's of all 30 curves (3 from each of the 10 runs). The 3 resulting curves would represent the total satellite data. So I ran a PC analysis on the final 30 curves.
Here are the first four SD values from that calc.
9.260698e+01 4.020218e+01 2.877281e+01 3.401857e-07
After the 3rd PC all the SD's are basically zero, meaning that there are only 3 curves for the entirety of the satellite data, as RomanM had already stated. So that worked, but what do the PC's look like?
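The pooling step can be sketched on synthetic data too. This assumes a made-up rank 3 matrix split into 10 blocks, with hypothetical names and sizes, and uses prcomp instead of princomp. It is not the satellite calculation itself, just the same idea in miniature: pool 3 PC's from each block, run PCA on the 30 pooled curves, and count how many SD's are not numerically zero.

```r
set.seed(7)
n_months <- 60

# Hypothetical data: 500 columns built from only 3 underlying curves
curves  <- matrix(rnorm(n_months * 3), nrow = n_months)
weights <- matrix(rnorm(3 * 500), nrow = 3)
sat     <- curves %*% weights            # 60 x 500 matrix of rank 3

# Split the columns into 10 blocks of 50 and keep 3 PC's from each block
blocks <- split(seq_len(ncol(sat)), rep(1:10, each = 50))
pooled <- do.call(cbind, lapply(blocks, function(idx) prcomp(sat[, idx])$x[, 1:3]))
print(dim(pooled))                       # 60 rows, 30 pooled curves

# PCA of the pooled curves; compare each SD to the largest one
final    <- prcomp(pooled)
sd_ratio <- final$sdev / final$sdev[1]
print(signif(sd_ratio[1:5], 3))

# Number of components whose SD is not numerically zero
n_real <- sum(sd_ratio > 1e-8)
print(n_real)
```

The 1e-8 cutoff is an arbitrary choice for "basically zero"; anything between rounding noise and the third SD would give the same count here.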
Note PC3. These 3 trends should not show a difference between the pre-1982 (pre-satellite) data and the post-1982 data. The pre-1982 data is imputed by RegEM. A good calculation would match the pre-1980 data well.
These are the 3 temperature curves used to reconstruct the continent of Antarctica. The same curves used to make the graphs on the cover of Nature. The same data used to sort out, by correlation, which stations get which weighting.
These curves have a problem.
I didn't believe it at first, yet this effect is something RomanM already noticed in a Climate Audit post where he plotted the ratios of pre- to post-satellite data from the info provided by Dr. Steig.
There are what appear to be missing bars in this plot, but that is actually an artifact of how the graphics are displayed; all ranges have bars. What is important to note is that the variation (think peak to peak amplitude) of the data prior to 1982 is less than that of the data after 1982. No values are greater than 1. This is a basic confirmation of my 3 PC's by a second party.
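For anyone who wants to try this kind of check, here is a hedged sketch of a pre/post ratio calculation. I am assuming the ratio is the standard deviation before 1982 divided by the standard deviation after, computed per series; I don't know that this is exactly what RomanM computed. The data below is random noise with the early half artificially damped, and every name is hypothetical.

```r
set.seed(3)
n_months <- 600
n_cells  <- 40

# Random stand-in for 40 monthly series
sat <- matrix(rnorm(n_months * n_cells), nrow = n_months)

# Assumed split for monthly data starting January 1957
pre  <- 1:300      # Jan 1957 to Dec 1981
post <- 301:600    # Jan 1982 to Dec 2006

# Artificially damp the early period so the effect is visible
sat[pre, ] <- 0.5 * sat[pre, ]

# Ratio of pre to post standard deviation for each series
sd_ratio <- apply(sat[pre, ], 2, sd) / apply(sat[post, ], 2, sd)
print(summary(sd_ratio))
```

A call like hist(sd_ratio) would then give a bar plot of the ratios; values below 1 mean less variation in the early period.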
What does this mean? Well, if proven out, it means that the 3 satellite PC's are not usable for a reconstruction and the picture on the cover of Nature is in need of a serious rework.
Sorry for the brute force R code but I spent the night considering other things.
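# get ant_recon.txt from Dr. Steig's webpage if you don't already have the file
# scan in satellite data
grid = scan("ant_recon.txt", n = -1) # 600 x 5509 = 3305400 values
# assumed layout: 600 monthly rows of 5509 columns, starting January 1957
gridts = ts(matrix(grid, ncol = 5509, byrow = TRUE), start = c(1957, 1), frequency = 12)
### calculate PC's for each block of up to 600 columns
satpc0 = princomp(gridts[,1:600])
satpc1 = princomp(gridts[,601:1200])
satpc2 = princomp(gridts[,1201:1800])
satpc3 = princomp(gridts[,1801:2400])
satpc4 = princomp(gridts[,2401:3000])
satpc5 = princomp(gridts[,3001:3600])
satpc6 = princomp(gridts[,3601:4200])
satpc7 = princomp(gridts[,4201:4800])
satpc8 = princomp(gridts[,4801:5400])
satpc9 = princomp(gridts[,5401:5509])
### add PC's together into a 600x30 matrix
### (assumed: the first 3 score columns from each of the 10 runs)
runs = list(satpc0, satpc1, satpc2, satpc3, satpc4, satpc5, satpc6, satpc7, satpc8, satpc9)
allpc = do.call(cbind, lapply(runs, function(p) p$scores[,1:3]))
### PC's of the 30 curves
finalpc = princomp(allpc)
finalpc$sdev[1:4] # SD values
tpc = ts(finalpc$scores[,1:3], start = c(1957, 1), frequency = 12)
plot(tpc, main = "Three PC's for Steig09 Satellite Trend")
savePlot("C:/agw/antarctic paper/three main PC's.jpg", type = "jpg")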