## Retrended RegEM Recon

Posted by Jeff Id on April 20, 2009

Well most know I think RegEM in the case of Steig’s Antarctic paper is scrambling trends based on correlation. This weekend I had a great idea, what if we re-trend all the data to a known value and run RegEM. Well I’ve always been a bit of trouble so I retrended it to a negative value of -0.2 C/Decade and ran Steig et al’s RegEM.

Since the code for the reconstruction was posted earlier, here’s the new section used for retrending data.

#retrend satellite data

newtrend=-0.02

deltaval=(1:300)/12

newsatanom=array(0,dim=c(300,5509))for(i in 1:5509)

{

slope=coef(lm(satraw_anom[,i]~time(satraw_anom)))[2]

newsatanom[,i]=satraw_anom[,i] + (newtrend-slope)*deltaval

}coef(lm((newsatanom[,1])~time(satraw_anom[,1])))

newsatanom=scale(newsatanom,center=TRUE,scale=FALSE)#center mean values

newsatanom=ts(newsatanom,start=1982,deltat=1/12)####################################################

####################################################

#repeat for surface stations

newsurfanom=array(0,dim=c(600,42))

deltaval=(1:600)/12for(i in 1:42)

{

slope=coef(lm(anomalies$surface[,i]~time(anomalies$surface)))[2]

newsurfanom[,i]=anomalies$surface[,i] + (newtrend-slope)*deltaval

}

coef(lm((newsurfanom[,1])~time(anomalies$surface[,1])))

newsurfanom=scale(newsurfanom,center=TRUE,scale=FALSE)#center mean values

newsurfanom=ts(newsurfanom,start=1957,deltat=1/12)

What the code does is to least squares fit a line to the original data (take the ‘slope’), find the difference between that ‘slope’ and the intended slope (‘newtrend’) and subtract the difference. The total slope in all the data after running was -0.2C/decade in every one of 5509 satellite data series and all 42 ground series. Now it’s important to remember, this is a linear slope change so the data still has all the short term fluctuations of the original series required for correlation in RegEM. Therefore there is a tiny change in the correlation of the data treated as negligible in this post. The new slope of all the data is a perfect -0.2 C/Decade.

For this reconstruction, after retrending, I followed the same process as in Steig et al and broke it down to 3 pcs and ran RegEM to see if my hypothesis that RegEM scrambles trends according to correlation was reasonable. For new readers, I’ve previously demonstrated that correlation is unrelated to trend at this link.

### Warming the Raw Sat Data

So here’s the result of RegEM on surface and satellite data artificially retrended to -0.2 C/Decade.

Well not too bad, the trend didn’t come out to -0.2 C/Decade but not that far off. The problem in this case is too complex to make huge conclusions, however it’s not too small an issue either, consider that the post 1982 data (half the trend) is direct from a 3pc version of perfectly -0.2 C/Decade trended data.

Here’s the trend from the pre-1982 data.

The red line is fit to the pre 1982 data, yet no slope at all!! Consider what that means for a minute. All the negative slopes assigned by me to be -0.2 C/Decade have been wiped out by the RegEM process. Negative trends copied over each other in the incorrect positive direction until the net slope is zero!!

The first step of the reconstruction process is to take the perfectly retrended satellite data and do a 3 pc analysis of it. The data post 1982 is an exact copy of the 3 pc version of this perfect trend data.

Even that first critical information compression step didn’t represent the trend well. Remember the point of Steig et al. is to try and detect an incredibly sensitive signal in truly noisy satellite & ground data. In this case Steig’s peer reviewed yet completely unverified math wiped out a -0.2 C/Decade trend in the pre 1982 data and the 3 PC estimation (data compression) reduced the trend in the post 1982 data by .013. For the non-math inclined RegEM (pre-1982 reconstruction) doesn’t cause the problem, the problem occurs in the manner used to implement it.

Just to get you thinking about other conclusions, the figure below is the spatial distribution of temperature trend -C per decade- after the reconstruction.

In case you didn’t notice, the spatial distribution contains the same shapes as SteveM pointed out. Since all trends have been equalized, this is a pretty strong confirmation that we’re just seeing Chladni patterns rather than actual weather. It’s nearly time to stick a fork in that one.

Beyond that, I don’t think there’s much new info, only that RegEM with 3 pc’s isn’t working as advertised and Jeff Id spent 3 more hours of his short life to show it.

## Jeff C. said

Jeff – I’ve been tied up for the past two weeks and have not been able to follow as closely as I would have liked, but you really got my attention with this one.

Let me make sure I understand what you did. Every single series (42 surface and 5509 satellite) was forced to a linear trend of -0.2 deg C/decade although the high frequency signature of each was left intact. By this you mean it was set equal to -0.2, not offset by -0.2 (big difference). You then ran through RegEM using the standard Steig process.

1957 to 1981 (surface era) slope is 0. 1982 to 2006 (satellite era) slope is -0.187. Trend map above shows a clear West/East divide despite every single series having the exact same slope as input to RegEM. Wow.

When you get a chance, try using more satellite PCs and also increasing regpar (change them independently of each other to de-embed the effect) to see what it takes to get a uniform trend across the continent. It would also be interesting to look at the continent trend map for 57-81 and 82-06 separately. If either of the two makes the continental trend uniform, it is pretty clear evidence the 3 PC truncation was causing (or at least intensified) the West/East divide that Dr. Steig touted.

I should have my other obligations wrapped up in a day or so and can help if you need it.

## Fluffy Clouds (Tim L) said

I am sorry for asking BUT…. did you try +0.2? would it flatten it out as well?

I’m just asking lol.

You see every one figure’s that THEY would check there math FIRST!

how can THEY say your stupid? U see my point here? they are the ones working with this for an income? we will end up paying taxes based on this crap?

sorry for the rant.

keep at it.

quote from “I dream of Jennie”

“Why did I not think of that?”

Tim

## Layman Lurker said

Very interesting demonstration and more good work Jeff. I’m sure the wheels are turning when it comes to testing the method through controlled data manipulation. Would there be any point to trying your alternative reconstruction methods on the re-trended data as well?

One alternative scenario I was thinking of was another sensitivity type of test. You have a cluster of surface stations on the peninsula with a warming trend. What would happen if the peninsula had only 1 surface station representing an average peninsula trend, but there was a hypothetical cluster of 15 or 20 stations around a station opposite the peninsula where there was a cooling trend. All of these stations would be similar in trend and HF noise just as the peninsula is in reality. A valid reconstruction method should not be sensitive to this type of change as the area trends would not change, only the distribution of stations by hypothetically moving the station “cluster” to a different region.

Feel free to ignore. I feel bad throwing this stuff at you all the time and taking up your time.

## Page48 said

Thanks for spending so much of your time on these reconstructions.

## TCO said

Jeff:

It looks like a neat approach. Kudos.

I would need to think a little more about exacly what you did and what it means before signing off that this is definitive in showing that Steig algorithm creates artifacts. For instance the differing time periods of your trends (stations verus satts) may mean that this is not as trivial as you think. IOW, sure the stations have the trend you make them have…for their entire run. But they don’t nesc have that same trend during the overlap with satts. So given that stations are used to “train” the satt extrapolation, this could mean that the out of overlap extrapolation is not a simple same trend.

## rephelan said

Funny how so many of us can come up with extra tests we’d like to see Jeff do…. seriously, though, keep it up. O learn something new here with every posting.

## Jeff Id said

#1,

Jeff good to see you’re back. I thought the wife gave you a new directive or something. You’re right I retrended the satellite data by slope and recentered it 0 mean for RegEM.

I should be able to run higher order RegEM TTLS on this without much trouble, it converged in like 11 iterations.

#3 I’d like to do retrended through my spatial weighting algorithm since the algorithm fills NA’s with the next closest station just for a QC check.

#5 TCO – Love the constructive comments, really I do. You’re right in your assumption about the length of the series having the total trend but shorter sections having different trend.

I’m way ahead of you on this, one of the advantages of working with the data is I already know the magnitude of the trends through the data series. Things that don’t make the cut for the blog but this is the reason I didn’t choose a magnitude less than 0.2 because it would get lost in RegEM. You can bet money Jeff C realized the magnitude I chose was strong enough to take over the local signal at least on average.

I could have picked -0.1 (close in magnitude but opposite in sign to Steig’s result) and had a much higher likelihood of a positive trend pre-1982 but that’s not the kind of blogging I like to do. This is another example of me not taking advantage of the math.

Still, good job. We can’t over conclude in this case.

—

I was surprised to see the method wiped out the trend and expected to see something around -0.1 but I was just guessing. I was also considering retrending the pre-1982 surface station data separately from the post 1982 data but it wouldn’t teach me much. Maybe I still will.

## Matt Y. said

Jeff,

Nice approach… similar to your hockey stick experiments. This is exactly the kind of sanity check any complicated calculation algorithm should be expected to pass. It certainly raises questions about the way RegEM is being used.

## TCO said

I also need think for a second if your test is a tautology. For instance, the algorithm is designed to do meaningful pattern matching more than just trend extrapolation, so if you set up a test which is that, you’re not really giving the method a fair shot.

I’m not saying this. Just saying I need to think about it, before I can sign off on the track you’re running down.

## Matt Y. said

Jeff,

What about setting different parts of the Antarctic to different trends? By setting all of the trends to the same value, you are essentially forcing it to use high frequency correlations to interpolate values since there is nothing else to grab onto. If you set different areas to different (possibly even opposite) trends, it would be interesting to see if the RegEM process could pick them out.

## Jeff Id said

#10, There are a thousand things to try, I’m trying to pick the best ones. While it would be interesting, I’m not sure it’s worth the several hours of time. If you look at the relationships between correlation and trend in my other post.

I was considering setting the peninsula to positive known and the rest to negative known just to see the mess but I’m getting interested in other aspects now and don’t know if I’ll spend the time.

## Layman Lurker said

#10

An interesting comment Matt. You could apply the logic in reverse though. Leaving only HF correlations to work with seems to show/confirm 3 things: 1) Inspite of re-trending and removing trend variability, the spatial patterns appear unaffected compared to Steig. 2) It is HF correlations, not trend, which account for spatial and temporal distribution of trend in the reconstruction. 3) Distribution of trend is an artifact of the method.

Given the fact that actual trend is variable, and not constant, it is easy now to visualize that localized trends could be smeared randomly and likely disproportionatly accross the continent. Things are complicated even further with almost 1/2 of the pre-1982 surface stations input comming from less than 5% of the continent. Jeff’s musing about re-running the above with a warming peninsula would be interesting indeed.

There has to be dozens of potentially meaningfull ways to manipulate the data and tease out the sensitivities. Any grad students out there reading this blog?

## WhyNot said

There are papers published that discuss the limitations of RegEm being used for reconstructing climate data and effect on trends. Some of the papers I have read discuss HF noise, SNR’s and other issues related to the math of RegEm. The interesting thing about the few papers I did read is they came to basically the same conclusion, RegEm artificially biases on a positive basis the resulting trend. What this all means still needs to be determined…. They are trying to understand if this is the correct or viable method for reconstruction. I would like to understand the math behind RegEm as a good understanding will answer all or most of the questions raised above. I know there is no “detail” in this response, but everyone is asking the same basic and simple question, “What does RegEm do to the data and what is it’s limitations?” Having Jeff ID run all these variations might provide us with an insight to the question, however, we really need a group to fully analyze the functional limits of RegEm based on HF, SNR, spatial interpretation (probably is zero or near zero), missing data frequency, etc. So good luck to you all trying to learn RegEm and hopefully someone/or a group will embark on this endeavor.

## Ryan O said

Jeff,

Great work. Here’s two simultaneously possible thoughts:

1. This potentially shows that the station covariance is

notconstant in time. RegEM assumes that it is. If the actual covariance changes, RegEM will compute nonphysical results.2. This definitely shows that 3 PCs are not sufficient to describe the actual covariance at

anytime. The fact that I can still clearly see all three eigenvectors (doing the superimposing in my head) in the finished product means that additional spatial information is needed to produce the smooth, -0.2 Deg C/decade trend in the actual data.Excellent demonstration, Jeff. 😉

## Layman Lurker said

#14

Ryan, Would progressively higher order processing (when you have autocorrelation) converge on the actual -2C trend? I’m not so sure about that.

## Ryan O said

#15 No idea. Actually, that’s not true. I have an idea: I suspect it would get closer, but I wouldn’t hold my breath about it ever matching.

## Jeff Id said

Does anyone know how to list the files contained on an FTP site in R?

## Stephen McIntyre said

#17. Try list.files(url) though FTP sites are often blocked to this sort of read. Sometimes you can get the html code and then parse it with grep and substr.

## Jeff Id said

#18, thanks, it didn’t work for me on the ice data from NSIDC. I’ll try the code.

## Fluffy Clouds (Tim L) said

shows that the station covariance is not constant in time. RegEM assumes that it is. If the actual covariance changes, RegEM will compute nonphysical results.

there is the phase change, very interesting.

## joe-h said

You aren’t the only one to notice there is something different about the Antarctic Peninsula region:

file:///C:/Documents%20and%20Settings/Owner/My%20Documents/GlobalWarming/JeffID/Maritime%20Antarctic.gif