## What’s Wrong with RegEM

Posted by Jeff Id on April 4, 2009

I have spent the morning working with higher-order RegEM on the actual satellite data. This analysis used Steig’s AVHRR data. I performed singular value decomposition to calculate 20 PCs, which were then used in RegEM. I placed this data in a matrix alongside the 42 surface station anomalies used in Steig09 and ran RegEM with regpar set to 10. Regpar = 10 is about as high as RegEM will go with this data and still converge. The point of the higher-order RegEM is to achieve a more natural distance vs. correlation plot. We’ve been over that already at this post.
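For readers who want to see the decomposition step spelled out, here is a minimal Python sketch of extracting PCs by SVD. This is not the code linked at the end of the post; the matrix here is random numbers standing in for the AVHRR anomaly grid, and the dimensions are placeholders, not Steig’s.

```python
import numpy as np

# Stand-in for the AVHRR anomaly matrix: rows are months, columns are
# grid cells. The sizes are arbitrary placeholders, not the real grid.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 500))

# Center each grid cell's series, then decompose. The first k columns
# of U (scaled by the singular values) are the PC time series that
# would go into RegEM; the rows of Vt are the spatial patterns (EOFs).
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 20
pcs = U[:, :k] * s[:k]     # 300 months x 20 PC series
eofs = Vt[:k, :]           # 20 spatial patterns x 500 grid cells

# Multiplying them back gives the rank-20 approximation of the grid.
X20 = pcs @ eofs
print(pcs.shape, eofs.shape, X20.shape)
```

Feeding more PCs into RegEM keeps more of the grid’s spatial detail, which is the whole point of the higher-order run.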

### Don’t Eat the Fish

Let’s start with the final trend. Steig09 reports the following trend:

The continent-wide trend is 0.126 +/- 0.07 Deg C per decade.

We know from a number of previous works that this trend is the result of spreading individual station trends over thousands of kilometers of RegEM grid. By using higher-order PCs we can reduce the artificial spreading, improving the localization of the information contributed by specific stations. At least that’s the theory; here’s the trend I got with 20 PCs and a regpar of 10.

The graph above shows almost no trend at all. This is what we expected from a graph by Steve McIntyre, shown below, which represents trends from multiple reconstructions of the automated surface weather station data combined with the manned surface station data using a method analogous to RegEM.

The plot above is interesting in that the trends are apparently asymptotic to zero. It looks like a linear drop at first glance, but the x axis in this case is exponential. My reconstruction of the satellite data seems to be in good agreement with SteveM’s trends at around 10 PCs. I think I have figured out why it’s happening. Here is a plot of the spatial distribution of trends.

Look at the nice red warming peninsula: excellent separation from other regions and an odd three-mode pattern which corresponds to some of the patterns we see from running PC analysis on spatially autocorrelated data. The graph below is again from Steve McIntyre.

The problem I have with this whole thing, other than the fact that I am the first individual to discover the incredible and dramatic cooling of the Ross Ice Shelf (seven o’clock above), lies in the correlation assumptions. The assumption is that a station’s covariance tells us how well its long-term trend matches. Unfortunately this is NOT the case. **The covariance between surface stations and satellite grids is based almost entirely on high frequency signal.** This is more obvious when you look at the correlation plots at this post from earlier, which took me a couple of tries to get right.

### Mullagain

The values from each trend in the post above are spread back and forth along the correlation slope line. This high-frequency correlation is the result of normal monthly fluctuations of over five degrees in some cases. That absolutely swamps a signal of a few tenths of a degree C per decade; therefore, for Steig’s paper to be correct, we need extraordinary faith that the short-term and long-term trends are in the same direction. Again, it turns out this key assumption behind RegEM is not realistic.
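The swamping argument is easy to check with synthetic numbers (these are illustrative, not Steig’s data): two series can share an identical 0.1 °C/decade trend and still show essentially zero monthly correlation once monthly noise of a few degrees is added.

```python
import numpy as np

rng = np.random.default_rng(42)
months = np.arange(600)                  # 50 years of monthly data
trend = (0.1 / 120) * months             # identical 0.1 C/decade signal

# Independent monthly fluctuations a few degrees C wide, dwarfing
# the tenths-of-a-degree decadal trend.
a = trend + rng.normal(0.0, 3.0, size=600)
b = trend + rng.normal(0.0, 3.0, size=600)

r = np.corrcoef(a, b)[0, 1]
print(f"monthly correlation despite identical trends: {r:.3f}")
```

On these numbers the shared trend contributes only about 0.14²/(0.14² + 9) ≈ 0.2% of the variance, so the covariance RegEM sees is nearly all short-term noise.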

Here is a plot from another post I did looking at correlation called

### The Blend…Inator!!!

These are correlation plots, NOT temperature plots. The little green + in the peninsula represents the surface station correlated to the rest of the satellite data. The raw data on the far left represents the most natural version of the satellite data. In the first plot below, the bulk of the Antarctic has a weak anti-correlation to this station. The warming of the peninsula region is well known from the high density and completeness of temperature stations there. If we were able to run RegEM TTLS without losing any data, as in the graph on the left, it would apply the positive long-term trend to the tip of the peninsula and a negative trend to all the blue areas, weighted by the strength of the correlation. The Ross Ice Shelf mentioned above is dark blue, so it would receive a strong negative contribution to its long-term trend.

The middle (left, middle, right) graph below is the satellite data as presented, from only 3 PCs. It doesn’t look too bad in this plot, but the Ross Ice Shelf has a neutral to slightly negative correlation. The right side, however, is reconstructed data from Steig09 and has only two PCs. It actually shows a positive correlation. This positive correlation is entirely artificial, a result of the low resolution of the mathematics used.


Let’s see another example.

These graphs show the same problems, negative correlation across the continent. This means that a negative long term trend from the peninsula is blended across the continent.

Well, maybe the trend in the plots is still correct; after all, I haven’t shown proof. There are two bases on the edge of the Ross Ice Shelf: Scott and McMurdo. Scott is a large base and is only a few miles from McMurdo. Due to the age of the bases, the temperature record for both stations is complete.

These two stations correlate well with the entire Antarctic, but they both (in the raw data on the left) show a negative correlation to the tip of the peninsula. Therefore, if Steig et al. have made a correct assumption, Scott and McMurdo should both have long-term trends opposite to the peninsula stations Arturo Pratt and Esperanza.

I should note that from the higher-order RegEM trend plot of the whole Antarctic (3rd figure from the top of this post), this is exactly what RegEM predicts: a strong negative trend in Scott and McMurdo. Fortunately we have the raw surface station data to check the accuracy of the assumptions, which again are **core assumptions of RegEM**.

I was surprised to find the exact same long-term slope. What does the covariance plot look like?

Note the negative covariance of the high-frequency data and the poor correlation. I’m an engineer, so I’m used to looking at a graph and drawing a conclusion; before you ask, r values and slopes are not required to interpret this uncorrelated scatter graph. The high-frequency correlation is terrible, but the low-frequency slopes are the same.
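The Scott/McMurdo-versus-peninsula situation can be mimicked with purely synthetic series: give two series the same trend but opposite-phase short-term swings, and the correlation comes out strongly negative while the fitted slopes stay essentially equal. The numbers below are a cartoon, not the station data.

```python
import numpy as np

months = np.arange(600)
trend = (0.1 / 120) * months             # same 0.1 C/decade in both

# Opposite-phase short-term swings of a few degrees C, a cartoon of
# the see-saw between the Ross Ice Shelf and the peninsula tip.
wiggle = 3.0 * np.sin(2 * np.pi * months / 12)
a = trend + wiggle
b = trend - wiggle

r = np.corrcoef(a, b)[0, 1]
slope_a = np.polyfit(months, a, 1)[0] * 120   # back to C/decade
slope_b = np.polyfit(months, b, 1)[0] * 120
print(f"r = {r:.3f}, slopes = {slope_a:.3f}, {slope_b:.3f} C/decade")
```

A covariance-driven infill sees these two as opposites and would flip the sign of the infilled long-term trend, even though both series warm at the same rate.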

Conclusions:

RegEM operates under the explicit assumption that station covariance can properly assign long-term trends. For this to be true, the long-term trends must be large enough to overtake the short-term noise in the system, or the long-term signal must follow the short-term signal. Neither is the case in Steig09.

In effect, using RegEM on the satellite and surface data in this case basically randomized the trends; the result looks normal only because of the heavy filtering caused by the low number of PCs. I’m going to think about it overnight before I say much more.

—

Code for this post is HERE.

## TCO said

I’m having a hard time reading this post on a laptop. The wide side margins clip the images. I can click on the images to see them anyhow (pain, but can) but the CA one, I can’t even do that.

Also, it is hard to follow your logic (not arguing the science… just even following what you are saying, given all the previous blog posts and all).

## TCO said

I’m not meaning to be a grammar nit. Since I mess up all the time. But given complicated ideas and logic flow…it is that much harder to follow when the rhetoric is flawed:

“This analysis used Steig’s AVHRR data which I performed singular value decomposition to remove 20 PC’s for RegEM.” Missing a preposition??

## TCO said

“We know from a number of previous works that this trend is the result of spreading individual station trends over thousands of kilometers of RegEM grid.”

1. We don’t KNOW. It’s not a predicate. We suspect. We are trying to prove. Etc. We have not nailed this frog to the barn yet. I’m not saying we won’t get it nailed. But let’s (please) not assert what we are in the middle of deriving/proving. It’s not even that it is over-skeptic-confident but that it is confusing for a reader.

2. blog posts and in process trial analyses are not = “works”. Works implies some sort of more finished product. Not the lab notebook. Let’s please not misuse words and try to make ourselves look fancy and such.

## Jeff Id said

#3 I worked all day calculating. I also fixed the sentence above.

And we do know the trend is the result of spreading individual station trends across thousands of kilometers. These things are known from the correlation plot.

## TCO said

Don’t eat the fish link is busted. Clip the extra date month to fix.

## TCO said

What is regpar and why important to have high? I understand why having more PCs will help give more geographic detail to the extrapolation. But what is regpar and why does it need to be high? And if you leave the regpar lower, can you go to even more PCs? How many PCs do you need to have the whole data set? Thousands?

## TCO said

“The plot above is interesting in that the trends are apparently asymptotic to zero. It looks like a linear drop at first glance but the x axis in this case is exponential. My reconstruction of the satellite data seems to be in good agreement with SteveM’s trends at around 10PC’s.”

1. Is it converging to zero or to some small value (for instance the value that you get if you had done your gridded work originally). In other words converging to a value that represents the area weighted sum of the trends across the continent. Which may not be zero.

2. I think to more broadly understand similarity to the SM trends, you would need to do calcs at different PC values like he did and then just plot the two graphs.

3. It seems similar in direction, but not in extent. You take 20 PCs to get to what he had at 10. What do you think is the reason for that? IOW, what difference in the methods drives that?

## Jeff Id said

Regpar =10 essentially limits the number of PC’s to 10 for TTLS which is Truncated Total Least Squares. It’s kind of funny to say it that way because it’s unfortunately more complex than that but the result is similar.

There are two limits to the RegEM calculation: neigs and regpar. Essentially, the use of 20 PCs didn’t make a huge difference because I only ended up with 10ish in RegEM. The math is complex but the result makes sense. Since RegEM relies on a covariance matrix for calculation, the long- vs short-term covariance issue is simple.

## TCO said

Yeah, it does look like a Chladni pattern. Would be interesting to see what sort of pictures you get with different numbers of PCs. Is it just giving more and more lobes, or is it sort of converging to a similar picture? Also wonder why you have an 8-PC drum pattern, when you have 20.

## Jeff Id said

Let me try again

There are two limits to the RegEM calculation neigs and regpar. Neigs is a hard clip to the number of PC’s, regpar is a clip on the number used for the ttls portion. Essentially, the use of 20 PC’s in the matrix setup didn’t make a huge difference because I only ended up with 10ish in RegEM.

The math is complex but the result makes sense. Since RegEM relies on a covariance matrix for calculation, the long- vs short-term covariance issue is simple.
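For anyone unfamiliar with the truncation being discussed, here is a bare-bones sketch of a generic truncated total least squares solve, the kind of step regpar clips inside RegEM. This is not Schneider’s RegEM code; the matrix sizes and the toy check are invented for illustration.

```python
import numpy as np

def ttls(A, B, k):
    """Truncated total least squares: regress B on A, keeping only the
    leading k right singular directions of the joint matrix [A B].
    Choosing k is roughly the role regpar plays inside RegEM."""
    n = A.shape[1]
    _, _, Vt = np.linalg.svd(np.hstack([A, B]), full_matrices=False)
    V = Vt.T
    V12 = V[:n, k:]        # predictor rows of the discarded directions
    V22 = V[n:, k:]        # response rows of the discarded directions
    return -V12 @ np.linalg.pinv(V22)

# Toy check: with no truncation (k equal to the number of predictors)
# and noise-free data, TTLS recovers the true coefficients.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
coef = np.array([[1.0], [2.0], [-1.0], [0.5], [3.0]])
B = A @ coef
print(ttls(A, B, 5).round(6))
```

Setting k below the number of predictors regularizes the solution by throwing away the weakest singular directions, which is why a low regpar smooths the reconstruction.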

## TCO said

What drives the convergence issue? Why can’t you have more? Is it computation or something about the data available? It seems like there should be some algorithm possible that would just take the entire data set and use that to crunch to find the expected extrapolations.

## TCO said

“The assumption is that we know by the covariance of a station how well its long term trend matches. Unfortunately this is NOT the case. The covariance between surface stations and satellite grids is based almost entirely on high frequency signal, this is more obvious when you look at the correlation plots at this post from earlier which took me a couple of tries to get it right.”

1. This is an interesting idea and different and should have its own paragraph or even section (not mashed into the end of the previous Chladni stuff.)

2. I did not follow from the mullagain post, how the high frequency versus low frequency issue was developed.

3. I really don’t completely understand what you are asserting, although I have an impression that you are saying that oscillations match but not trends? And I don’t see the proof of that laid out for examination.

## TCO said

“The values from each trend in the post above are spread back and forth along the correlation slope line. This high frequency correlation is the result of normal monthly fluctuations of over five degrees in some cases. This absolutely swamps a signal of a few tenths degree C per decade, therefore for Steig’s paper to be correct, we need to have some extraordinary faith that the short term and long term trends are in the same direction. Again, it turns out this key and very important assumption before running RegEM is not realistic.”

1. OK, I was expecting it all to be in the Mullagain post, but see that you are developing the idea from previous para here.

2. It’s not clear what is being correlated on these graphs. Is the explanation from some earlier post before you picked up the golf ball?

3. I’m not disagreeing… just not grokking the issue with the high frequency correlations. What are the large numbers anyhow? The figure has no caption or axis labels. And anyhow, would you be happier if high freq didn’t track???

4. I still think the issue of high freq to low freq is interesting…but I just don’t get enough from your explication to be able to judge the RegEM concern you have.

## Matt Y. said

Jeff, interesting stuff as always.

Just curious, have you been able to duplicate the Steig calculation? I’m guessing you have, but I haven’t seen it on your site. It would give a little bit more of a warm and fuzzy to the other calculations.

“In effect using RegEM on the satellite vs surface data in this case basically randomized the trends…”

Wow. That would be huge if it could be proven conclusively. Could you elaborate on how you came to that conclusion.

## Jeff Id said

I don’t see which figure doesn’t have axis labels. The questions are fine, I think you’re getting closer.

If you look at the last 3 graphs we have two stations, one in the peninsula and one on the ice shelf. They have positive trends of equal magnitude, so you might expect high correlation, yet the last graph shows they correlate negatively. This is apparent in the satellite data as well, where you can see that the Ross Ice Shelf data correlates quite strongly negatively with the tip of the peninsula.

The result is that RegEM incorrectly and a bit randomly copies a strong negative trend to the Ross ice shelf. Third figure in the post.

## Jeff Id said

Sorry, the answer was for TCO but it also applies to Matt Y’s question.

## Jeff Id said

Matt,

I’ve done it before at this post

https://noconsensus.wordpress.com/2009/02/20/satellite-temperature-trend-regridded/

I haven’t repeated it from the raw sat data but I was able to extract 3 similar PC’s from the raw (as presented) data.

https://noconsensus.wordpress.com/2009/03/27/preliminary-pca-of-steig-data/

I was considering running these through the RegEM in the near future. The similarity means the results will be pretty close but not perfect.

## TCO said

Jeff:

The mullagain plots lack axis labels or caption. It’s not just a clerical chide. It is actually hard to follow the story.

## Jeff Id said

#18 It’s no problem; I didn’t take it as a chide, I really didn’t know. I see what you mean now. The graphs were too small, and the original post described that the data in the Mullagain plots was a correlation between surface station data and Comiso’s sat data at the same location. I’ll add some words back in.

## TCO said

Jeff:

I think your point on the high freq is interesting and should be developed further.

It doesn’t bother me so much that the overall trends for the two stations were similar but the correlation poor. I assume the algorithm will just not then use the one as a predictor for the other. Actually in general, I thought we skeptics chided the AGWers for having matched trends but NOT matched high freq response.

Parting shots for now:

It’s still bad style to assert as already previously proven what the blog post is trying to show. I mean the reader is expecting to treat that as a given and go on to next steps.

Appreciate you trying to do synthesis, but it is still a pain to read this collection of blog posts, parsing between comments on cartoons and development of ideas versus final compilation. The skeptic community should have sympathy for readers and not expect this sort of presentation to be a proof. It should at least be an organized white paper.

You have some interesting ideas.

## Jeff Id said

#20 Imagine trying to pile this all together. It’s not like they pay me 6 figures to work on it.

You said:

“I assume the algorithm will just not then use the one as a predictor for the other.”

The algorithm is not that smart. Actually it simply reverses the trend upside down according to the magnitude of the de-correlation. That’s what makes the Ross Ice Shelf so cold in the third figure above, and that is the evidence that RegEM cannot work for this analysis with this data.

The more I think about it the more apparent it becomes. I’m holding my tongue for a bit though and will do some more confirmation work tomorrow.

## Layman Lurker said

Great insights here Jeff.

Some thoughts and questions:

1. Might you be thinking about a pseudodata experiment using both high and low frequency signal to replicate the observations of Scott/Arturo?

1. Depending on how infilling was done for cloud masking, it may have filtered or distorted the high frequency signal – another complicating factor for covariances.

2. Spatial autocorrelation is likely related to high frequency signal then as well. No?

## Jeff C. said

Good post Jeff. A lot to digest all at once, but here are a couple of thoughts.

We know from the distance correlation scatter plots that the Comiso satellite data and the surface data (the inputs) had very good distance correlation, while the Steig recon (the output) has poor distance correlation. What you have posted seems to indicate that using 20 PCs and regpar = 10 has good distance correlation, but it might be interesting to see the scatter plots.

Monaghan has something that looks sort of like your trend map, but his runs from 1960 to 2002. His is multi-lobed, but the lobes fall in different positions. It might be worthwhile to try your recon running it over that time period.

We really have two separate issues here: the reduction of the satellite data to 3 PCs, and using RegEM with regpar = 3 for infilling the 1957 to 1981 portion. I think they both have the same general effect (spatial and temporal smearing), but attempting to de-embed them from each other might yield some clues.

## Jeff C. said

I’ve reread the post and the hi freq/lo freq stuff is starting to sink in. You ought to be able to devise a test by creating a bunch of time series with both hi and low freq signals of various amplitudes and correlations. Start running them through RegEM and see if there is a clear point where it switches from one to the other.

It would be very interesting to see how large an amplitude a decadal-type signal needs to have for RegEM to lock onto it in the presence of high-frequency clutter.
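Jeff C’s proposed test can be roughed out without running RegEM at all: sweep the amplitude of a shared decadal signal against fixed monthly noise and watch when the correlation, which is the raw material RegEM works with, actually emerges. All of the numbers below are made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(7)
months = np.arange(600)
decadal = np.sin(2 * np.pi * months / 240)    # shared 20-year signal

# Sweep the low-frequency amplitude against fixed +/-3 C monthly noise
# and record the correlation between two series sharing the signal.
rs = {}
for amp in (0.1, 0.5, 1.0, 3.0, 5.0):
    a = amp * decadal + rng.normal(0.0, 3.0, size=600)
    b = amp * decadal + rng.normal(0.0, 3.0, size=600)
    rs[amp] = np.corrcoef(a, b)[0, 1]
    print(f"amplitude {amp:4.1f} C  ->  r = {rs[amp]:+.2f}")
```

The expected correlation here is (amp²/2) / (amp²/2 + 9), so a tenths-of-a-degree signal is invisible and even a 3 °C decadal swing only reaches r ≈ 0.33, which gives a rough feel for the amplitude a low-frequency signal needs before covariance-based infilling could lock onto it.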

## Jeff Id said

#22 Your first and third points have my interest now. I think it’s time to finish some of the demonstrations I’ve been working on using artificial data.

#24 I’m not sure how to show how large the amplitude needs to be. When I think about it, the short term signal which in this case is noise will always have a distorting effect on the RegEM output no matter how small. You know as well as anyone in this case the short term signal is at times around +/-5C which is very very dominant over the small tenths of a degree decadal trends. I don’t think RegEM can work for reconstructing the low frequency trend.

I think a dim little lightbulb just went off in my hamster wheel, there goes several more hours.

## TCO said

I think the “high frequency” match (monthly… not minutes or something) is better than if there were no match. Imagine how much we would (and have) kvetched about things like bcps or the like that match trends over long periods, but lack wiggle-matching type of proof of correlation?

The other question though is does that match over-dramatize the usefulness of the instrument? I don’t really think so and feel better about a high freq matched set of proxies than low (since low could just be luck).

Another question is if there are long trends that cause the instruments to separate in terms of match. That’s really hard to prove one way or the other, but I think you have to have something like the uniformitarian principle.

————————-

I think the more curious issue is just how much help this whole sat-data extrapolating exercise can be at all?

I mean think about it. We have two sources of data: (1) stations and (2) satts. Stations have double the time duration but have much less geo coverage. Satts have way more granular coverage and cover some large interior regions that stations don’t access, but have half the time duration coverage. There are a lot of other differences, with cloud cover and satt drift and even some coverage gaps affecting satts. With some gaps in record on the stations. With a different physical thing being measured on each and different sources of noise. All that said, I think the most important differences between the two are what I said at the beginning: “satts more geo-coverage, but less time; stations less geocoverage, but more time”.

What Mann and crew tried to do was see if they could combine the data to get a better picture of what happened in the non-station covered areas, during the time that satts were not there. It’s really a noble and interesting challenge. That does not mean it works right though. But it is a really cool thing to try to do.

What they basically do is try to extrapolate the satt coverage back in time, by using the correlation to stations. The idea is can you get something more useful than just gridding the space and doing some Hansen-like average of the stations for the 50-years, by this satt extrapolation game.

But in order to do so, there have to be some characteristic patterns of response that the algorithm would pick up. For instance, if we think of the stations as roughly covering the exterior and the satts giving better interior coverage: is there some dynamic where a small increase in temp at the exterior is correlated with larger increases in the center? (Perhaps the exterior might be more damped because of ocean and the interior wiggles more.) Or perhaps there are characteristic side to side patterns (like happens in NA with El Nino) where when one part of the country gets hotter, you know another got colder.

I don’t know whether these kinds of sophisticated responses occur in the system. But if they did, then some attempt like what Mann tried, might give us a much better answer than just geo-gridding the stations and ignoring the ‘lites. The question is do those types of things occur. Perhaps looking at the data during the period of multiple coverage would help with that assessment.

Of course, I’m being very general when I talk about the benefit of this approach. There are a lot of pitfalls with complicated algorithms, so we need to watch out and make sure that (for instance) the algorithm would converge to the expected result (i.e. the trivial one) if none of these sophisticated things is going on. Also, it seems that we would want more data rather than less, to allow this algorithm to best find the sort of pattern response that helps us get non-trivial insights. And we would want to watch out for anything in the machinery, the guts, of the method that biases it.

But all said, I think it’s an interesting approach to the problem…and not one that should be dismissed out of hand.


## Jeff Id said

TCO,

Sometimes you blow me away. Great comment.

There are several I think important points in here. One example which TCO picked up on that I’ve been considering is this.

“Or perhaps there are characteristic side to side patterns (like happens in NA with El Nino) where when one part of the country gets hotter, you know another got colder. ”

Think about that: we see this oscillation in the PC3 data movie, where the southern portion below the mountains often oscillates back and forth with the northern part of the continent. This effect guarantees decorrelation and is visible in the 5th graph from the top. The problem is that both halves could be warming at the same rate, yet RegEM would fight itself, applying both positive and negative long-term trends to the same area and muting the true signal.

“But all said, I think it’s an interesting approach to the problem…and not one that should be dismissed out of hand.”

It is an interesting approach; the days I’ve spent on it pretty well guarantee I wasn’t going to dismiss it out of hand. Up until yesterday morning I was planning to demonstrate several improvements to the method which may have revealed the same results as Steig, and I would be writing a bunch of supportive posts.

1. Using only the surface data, with sat data as the correlation matrix only.

2. Using higher order to reduce surface station correlation spreading.

3. Using very high order with truncated PCs.

4. Using non-corrected raw surface station data to create a trend.

Unfortunately, this latest problem has stopped me in my tracks. After a night of thinking on it, my suspicion is that Expectation Maximization simply is unusable for this type of spatial reconstruction where the covariance data is not related to the important long term trends in the signal.

## Layman Lurker said

#25

“#22 Your first and third points have my interest now.”

What do you mean, I didn’t have a third point. 😉

BTW, congratulations on the Ross Ice Shelf temp trend discovery. Hereby nicknamed: the “Id Ice Shelf”. Can a Nature article be in the works?

## Layman Lurker said

I think what TCO talks about in #26 is common sense when you think about it. Sometimes we don’t see the forest for the trees. One only has to look at the east station temp patterns comparing Davis, Dumont-Durville, and Casey to see how well the high frequency wiggles line up. Ditto for the peninsula stations. And if one was to look at these station temp series and compare by eyeball, we would have no problem grouping these regions as being distinct without knowing the individual linear trends.

## John F. Pittman said

Jeff, you said:

“#24 I’m not sure how to show how large the amplitude needs to be. When I think about it, the short term signal which in this case is noise will always have a distorting effect on the RegEM output no matter how small. You know as well as anyone in this case the short term signal is at times around +/-5C which is very very dominant over the small tenths of a degree decadal trends. I don’t think RegEM can work for reconstructing the low frequency trend.”

Couldn’t you do a small run starting at +/-1 SD and adding +/-1 SD up to a total of +/-6 SD to test the noise response in your artificial data?

## Jeff Id said

I’m guessing the long-term trend would have to be 2:1 over the short-term trend OR linearly related, i.e. all highly correlated short-term patterns with a positive slope defined as positive, and anti-correlated short-term patterns with a negative slope defined as negative.

## Jeff Id said

I wrote it that way because currently we’re at 1 in 10–30 long vs. short term and we probably need to be 2 in 1. It ain’t too close.

## Layman Lurker said

I have always been somewhat puzzled by the negative correlation of the tip of the peninsula, not just to the continent, but to the inner half of the peninsula. The corresponding linear trends would certainly not reflect this, and to me it is better explained by high frequency pattern differences.

When watching the animated patterns for Antarctica at Weather Underground, you can see the relative stability of the peninsula tip – reflective of ocean dominance (at least intuitively). Inland, you see the polar air “vortex” pattern. In the boundary area between the ocean and the polar dome, you see a highly turbulent advection and very dramatic temp swings.

Obviously, we are only talking about time scales of a few hours in the animation, but I think it might show something connected with the frequency patterns.

http://www.wunderground.com/global/Region/AN/pxTemperature.html

## Hu McCulloch said

Jeff —

Great work! I especially like Figures 1 and 3 above. What happens if you do the same thing with neigs and regpar both = 3? Does anything like ant_recon.txt emerge? We know Steig set k = 3 in the end, but perhaps it makes a difference where this was done.

## Layman Lurker said

#33

Apologies, there is no negative correlation between the peninsula tip and the inner peninsula. I confused myself looking at the wrong “Blend-in-ator” plots.

## Jeff Id said

#35 Actually, there is a small area of negative correlation. I thought this is what you were referring to.

#34 Roman, Ryan or Jeff C (not sure which) showed the reconstruction is closer when neigs is unset, giving a full set of PCs for the calc.

Instead of doing my fun work, I’m finishing up a reconstruction according to regpar=3 just to show the similarities. So far it seems pretty close.

## Layman Lurker said

#36

I only see the negative correlation when comparing some of the coastal stations (like Casey and Davis) with raw AVHRR, not when Arturo is compared to raw. It is just a weaker positive correlation.

BTW, my hunch is that the peninsula tip would still have distinctive high frequency patterns compared to inner peninsula, just not negatively correlated.

In my mind, it enhances the argument for higher order factors to be included in the reconstruction. Particularly prior to 1982 when so much of the input data comes from the tip of the peninsula.

## michel said

Jeff, no 15, very nice, pure Zen, the discoverer of Ross Shelf Cooling, of course!

Suddenly he was enlightened!

## Fluffy Clouds (Tim L) said

The freq. keying is fascinating!

“The algorithm is not that smart. Actually it simply reverses the trend upside down according to the magnitude of the de-correlation…” Here we are with the inversion I talked about!

Great work here Jeff!!!!

TCO putout too. LOL

## TCO said

You’re a freq. Do you post high?