# the Air Vent

## Because the world needs another opinion



## The Three PC’s of the Antarctic

Posted by Jeff Id on February 18, 2009

Tonight I was going to replicate the Antarctic reconstruction for the AWS temperature using the gridded data from Jeff C. My goal was to do the same for the satellite series and run a reconstruction using Jeff C’s regridded dataset.

I needed the PC’s and weightings for the 5509 satellite series but ran into a small problem. R’s princomp can’t work with more columns than rows (an SVD-based routine such as prcomp can, but I stuck with princomp). The satellite data consists of 5509 columns with 600 monthly rows each, as presented by Dr. Steig on his webpage. As princomp sees it, the problem is underdetermined: more variables than observations.
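A minimal sketch of the column/row limit, using made-up random data rather than the satellite file. The matrix sizes are arbitrary assumptions for illustration: `princomp` stops when a matrix has more columns than rows, whereas `prcomp`, which works through a singular value decomposition, accepts the same matrix.

```r
# Synthetic stand-in for a wide data set: 20 rows (months), 50 columns (series).
set.seed(1)
wide <- matrix(rnorm(20 * 50), nrow = 20, ncol = 50)

# princomp() refuses a matrix with more variables than observations;
# tryCatch() captures the error message instead of halting the script.
res <- tryCatch(princomp(wide), error = function(e) conditionMessage(e))
print(res)

# prcomp() goes through the SVD, so the wide matrix is accepted.
pc <- prcomp(wide)
print(length(pc$sdev))

# princomp() is fine once there are fewer columns than rows.
ok <- princomp(wide[, 1:10])
print(class(ok))
```

This is why the series further down are fed to princomp in blocks of at most 600 columns.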

Well, I wanted to run my analysis of 3 PC’s against Jeff C’s surface station data, so I needed the 3 PC’s. I ran the first 600 columns through PC analysis and found 3 reasonable looking PC’s. I needed the weightings for the rest of the series too, so I ran them 600 at a time for all 5509 series. I then checked to make sure that the PC’s from one set of 600 were the same as another. . . . They weren’t. I had 5509/600 = 9.18 sets of data, which took 10 different PC runs. Each run had its own 3 curves!!

After a few minutes I decided that the final set of PC’s would be the PC’s of all the 30 curves. These 3 curves would represent the total satellite data. So I ran an analysis of the PC’s on the final 30 curves.

Here are the SD values of the calc.

9.260698e+01 4.020218e+01 2.877281e+01 3.401857e-07

After the 3rd PC all the SD’s are basically zero meaning that there are only 3 curves for the entirety of the satellite data as RomanM had already stated. So that worked but what do the PC’s look like…..

Note PC3. These 3 trends should not show a difference between the pre-1982 (pre-satellite) data and the post-1982 data. The pre-1982 data is from RegEM and is imputed. A good calculation would have the pre-1982 data match the post-1982 data well.

These are 3 temperature curves used to reconstruct the continent of Antarctica. The same curves used to make the graphs on the cover of Nature. The same data used to sort out by correlation which stations get which weighting from the data.

These curves have a problem.


I didn’t believe it at first, yet this effect is something RomanM already noticed in a Climate Audit post where he plotted the ratios of pre- to post-satellite data from the info provided by Dr. Steig.

There are what appear to be missing bars in this plot, but that is actually an artifact of how the graphic is displayed; all ranges have bars. What is important to note is that the variation (think peak to peak amplitude) of the data prior to 1982 is less than that of the data after 1982: no ratio is greater than 1. This is a basic confirmation of my 3 PC’s by a second party.
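A rough sketch of that kind of check, on made-up data rather than Dr. Steig’s. It takes the ratio of pre-1982 to post-1982 standard deviation for each series. Using the standard deviation as the measure of variation is an assumption here (the statistic behind RomanM’s plot is not spelled out in this post), and the damping factor is invented purely so the effect shows up.

```r
set.seed(7)
# Synthetic stand-in: 600 monthly values starting January 1957 for 5 made-up series.
n_months <- 600
series <- matrix(rnorm(n_months * 5), n_months, 5)

# Deliberately damp the first 300 months (1957-1981) to mimic the described effect.
series[1:300, ] <- 0.5 * series[1:300, ]
x <- ts(series, start = c(1957, 1), frequency = 12)

# Split at the start of the satellite era.
pre  <- window(x, end   = c(1981, 12))
post <- window(x, start = c(1982, 1))

# Ratio of pre to post standard deviation, one value per series.
ratio <- apply(pre, 2, sd) / apply(post, 2, sd)
print(round(ratio, 2))
```

A ratio below 1 for every series is the pattern described above: less variation before 1982 than after.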

What does this mean? Well if proven out it means that the 3 satellite PC’s are not usable for a reconstruction and the picture on the cover of Nature is in need of a serious rework.

—-

Sorry for the brute force R code but I spent the night considering other things.

#remove comment if you don't already have the file

#scan in satellite data
grid=scan("ant_recon.txt",n= -1) # 5509 x 600 = 3305400 values
dim(grid)=c(5509,600)
grid=t(grid) # now 600 months (rows) x 5509 series (columns)

gridts=ts(grid,start=1957,deltat=1/12)

### calculate pc's for each block of up to 600 columns (princomp needs columns <= rows)

satpc0 = princomp(gridts[,1:600])
satpc1 = princomp(gridts[,601:1200])
satpc2 = princomp(gridts[,1201:1800])
satpc3 = princomp(gridts[,1801:2400])
satpc4 = princomp(gridts[,2401:3000])
satpc5 = princomp(gridts[,3001:3600])
satpc6 = princomp(gridts[,3601:4200])
satpc7 = princomp(gridts[,4201:4800])
satpc8 = princomp(gridts[,4801:5400])
satpc9 = princomp(gridts[,5401:5509]) # last block has only 109 columns

### bind the first 3 pc's of each run together into a 600x30 matrix

m=cbind(satpc0$scores[,1:3],satpc1$scores[,1:3],satpc2$scores[,1:3],satpc3$scores[,1:3],satpc4$scores[,1:3],
        satpc5$scores[,1:3],satpc6$scores[,1:3],satpc7$scores[,1:3],satpc8$scores[,1:3],satpc9$scores[,1:3])

threemainpc=princomp(m)

### plot

tpc=ts(threemainpc$scores[,1:3],start=1957,deltat=1/12)
colnames(tpc)=c("PC1","PC2","PC3")
plot(tpc,main="Three PC's for Steig09 Satellite Trend")
#savePlot with type="jpg" is for the Windows graphics device
savePlot("C:/agw/antarctic paper/three main PC's.jpg",type="jpg")
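
The block-by-block procedure can also be written as a loop. This is only a sketch on made-up rank-3 data, not the satellite file: `block_pc_scores` is a hypothetical helper, the sizes are arbitrary, and `prcomp` is swapped in for `princomp` because the SVD route copes better with exactly rank-deficient synthetic data.

```r
# Hypothetical helper: first k PC scores from each block of columns.
block_pc_scores <- function(dat, block, k = 3) {
  starts <- seq(1, ncol(dat), by = block)
  parts <- lapply(starts, function(s) {
    cols <- s:min(s + block - 1, ncol(dat))
    prcomp(dat[, cols])$x[, 1:k]   # prcomp swapped in for princomp
  })
  do.call(cbind, parts)
}

# Made-up data: 120 months x 250 series, all built from the same 3 curves.
set.seed(42)
curves  <- matrix(rnorm(120 * 3), 120, 3)
weights <- matrix(rnorm(3 * 250), 3, 250)
dat     <- curves %*% weights

combined <- block_pc_scores(dat, block = 100)  # blocks of 100, 100 and 50 columns
final    <- prcomp(combined)                   # PCA of the 9 stacked block PCs

print(dim(combined))
print(signif(final$sdev, 3))   # only the first three should be clearly non-zero
```

With the real file, `dat` would be the 600 x 5509 matrix and `block` would be 600, giving the same 10 runs and 30 stacked curves as the brute force version.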

1. ### jeez said

FYI

The “appearance” of missing bars in this case is not an artifact of the browser display. I downloaded the jpg and opened it in image editing software to make sure.

This is a problem with the output from R, perhaps a setting, or is an artifact of post output processing, such as bilinear scaling. The current graphic is 586×594. Was that the intended output size? If you use any image editing software to resize your image, always use bicubic scaling.

Woof!
====

3. ### Jeff Id said

#1

In this case it is actually a problem in the JPG. I did a screengrab from Roman’s thread on CA, so the bars here are really missing. I’ll fix it from work where I have a better machine.

Did you notice PC#3 in the middle graph? This is one of a total of 3 actual curves supposedly representing temperature from satellites for the Antarctic :).

4. ### Craig Loehle said

Jeff, on CA you mentioned that you are “just” an engineer. Well, if so, then Steig needed an engineer on his “team”, I think. If this checks out in the light of a new day, then there really is a problem with the reconstruction. The question is how do you present this in a way that is clearly understood?

5. ### David Jay said

#2

But I thought the clue was the dog that DIDN’T bark (Sherlock Holmes, “Silver Blaze”).

6. ### RomanM said

Nice work, Jeff.

jeez is wrong. The graph that I linked to (on my blog) to put it on CA was perfectly fine. If you would rather have a good copy (original size: 672 x 671 pixels) to post up there, you can find it here:

The graph was resized along the way (probably by the CA website to make it smaller), producing the distortion.

“After a few minutes I decided that the final set of PC’s would be the PC’s of all the 30 curves. These 3 curves would represent the total satellite data. So I ran an analysis of the PC’s on the final 30 curves.”

Good move. The 10 sets of 3 PCs should all be rotations of the 3 PCs which were used to construct the data. Each set of eigenvalues would possibly differ from those in another set, but the sum of the eigenvalues would be constant for every triple of PCs. By doing a PC on the combined 30 PCs, it should produce the set with the maximum possible eigenvalue attached to the first PC, etc.
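
A small sketch of the idea that every block’s 3 PCs are re-mixes of the same underlying curves. It uses made-up data, a hypothetical `make_block` helper, and `prcomp` rather than `princomp`. It only checks the weaker statement that one block’s PCs are exact linear combinations of another block’s; it does not test the eigenvalue claims.

```r
set.seed(3)
n <- 120
# Three made-up underlying curves shared by every series.
curves <- matrix(rnorm(n * 3), n, 3)

# Hypothetical helper: p series, each a random weighting of the 3 curves.
make_block <- function(p) curves %*% matrix(rnorm(3 * p), 3, p)

# First three PC scores from two independent blocks of 80 series.
pcs_a <- prcomp(make_block(80))$x[, 1:3]
pcs_b <- prcomp(make_block(80))$x[, 1:3]

# Regress block B's PCs on block A's PCs (no intercept: scores are centred).
fit <- lm.fit(pcs_a, pcs_b)
max_resid <- max(abs(fit$residuals))
print(max_resid)                   # expected to be at rounding-error level

# The 3 x 3 coefficient matrix maps one set of PCs onto the other.
print(round(fit$coefficients, 3))
```

Residuals at rounding-error level mean the two triples span the same 3-dimensional space, which is why a further PC run on the stacked triples collapses back to 3 curves.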

7. ### Jeff Id said

Thanks, I updated the graphic.

8. ### Jeff Id said

#4 I thought it would get a bit more of a reaction than it did. You just never know how people are going to react. When I look at these PC’s I see a broken paper; other people seem to think of it as a curiosity.

Here’s my take:

There’s absolutely no way that these PC’s can possibly represent temperature in the Antarctic. This is the main satellite data used for the conclusions on the cover of Nature. I have to admit that I was fooled by it initially because I thought it would be more reasonable than this. I believed them; now I know why they don’t want to show the data. After today, I think they should be considering a retraction of the article.

Perhaps they could ship a million erasers with the next monthly issue of Nature or a sticky cover to place over the old ones.

9. ### John F. Pittman said

Jeff, you state “There’s absolutely no way that these PC’s can possibly represent temperature in the antarctic”. Want a bit more reaction? State specifically why it is absolute. I see the three PC’s and the SD’s and think that I don’t know how RegEM works, so perhaps such wide SD’s and the PC3 pre-1982 may not be a paper killer.

10. ### Jeff Id said

John,

I understand your point as well as Dr. Loehle’s: people don’t know what it means. Hell, I barely do and I did it.

RegEM is the process used to derive the pre-1982 portion of the 3 PC curves. These curves are weighted (multiplied) and added together (1 + 2 + 3) to make up the temperature graph at a point on the map. Each point has 3 different weights for 3 different curves. I believe the PC’s match well post 1982 because Steig has the data (which he won’t share yet). The pre-1982 data, which is the crux of the paper, needs to match the post-1982 data to be accurate. There’s no way temps changed as much as is shown in the 3rd PC; it ain’t natural.
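
The weight-and-add step can be sketched as follows. The curves and weights here are made-up numbers, not values from Steig’s data. They only illustrate how 3 curves and 3 weights per point turn into a series at that point.

```r
set.seed(11)
n_months <- 600

# Made-up stand-ins for the 3 PC curves (600 months each).
pcs <- matrix(rnorm(n_months * 3), n_months, 3,
              dimnames = list(NULL, c("PC1", "PC2", "PC3")))

# Hypothetical weights for one grid point (not taken from the real data).
w <- c(PC1 = 0.8, PC2 = -0.3, PC3 = 0.1)

# Series at that point = w1*PC1 + w2*PC2 + w3*PC3.
temp_point <- drop(pcs %*% w)

# Many points at once: one column of 3 weights per grid point.
W <- matrix(rnorm(3 * 4), 3, 4)
temp_grid <- pcs %*% W     # 600 months x 4 grid points

print(length(temp_point))
print(dim(temp_grid))
```

Because every point is built from the same 3 curves, a step in one of them before 1982 is carried into every point whose weight on that curve is not zero.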

If the 3rd pc has a huge pre 1982 step in the data, how can the pre 1982 imputed (infilled) data match anything post 1982? It clearly doesn’t, so what is this mess?

There are some subtleties which I can’t figure out how to explain easily. This also means to me that stations in the peninsula are more likely to have their trends (positive and inverted) applied across the whole continent. All the variance moved into the first two PC’s.

11. ### Terry said

Very nice work Jeff – have you gotten any feedback from Eric, Gavin, etc?

12. ### Layman Lurker said

Jeff, what is your take on this 1982 step? Data not properly input into RegEM? Spatial / temporal bias?

13. ### WhyNot said

There could be another reason: simply that the number of ground stations used to collect data pre 1982 was minimal compared to post 1982. If this is the case, wouldn’t the PC3 curve overdefine the problem and/or represent error/error magnitudes due to the infilling process? I am only guessing (I don’t know how RegEM works), but if you were to take the post-1982 satellite data and generate 4 PCs, would the 4th PC look like PC3 pre-1982? This is probably a really naive post, sorry 🙂

14. ### Jeff Id said

#12 Sorry not to answer for so long, the last couple of days have been difficult. I think it might be an artifact of heavily repeating the same pattern, (in this case the peninsula) but I am just guessing.

#13 could be right too, there are a lot more stations post 1982.
