# the Air Vent

## Because the world needs another opinion



## Happy Feet – Filtermatics

Posted by Jeff Id on March 30, 2013

Something interesting happened at WUWT today. It isn’t a typical issue of late, and it requires a bit of math skill. A post by Willis Eschenbach brought back old memories of the days when skeptic blogs like this one were math-centric. Fortunately the math which Willis discusses this time, is relatively lightweight stuff, and it happens to involve the fortuitous filtering activities of Mannian filter-matics.

I highlighted an email on the topic a few weeks ago here, which contains a quote that I think belongs in Willis’s article. Michael Mann has long been interested in filtering methods which promote the “Cause”. I have to say that Willis’s example puts a spotlight on how awkward the team has been at promoting fortuitous filters.

5 PM 10/14/2003 -0400, Michael E. Mann wrote:

Dear All,
To those I thought might be interested, I’ve provided an example for discussion of
smoothing conventions.  Its based on a simple matlab script which I’ve written (and
attached) that uses any one of 3 possible boundary constraints [minimum norm, minimum
slope, and minimum roughness] on the ‘late’ end of a time series (it uses the default
‘minimum norm’ constraint on the ‘early’ end of the series). Warming: you needs some
matlab toolboxes for this to run…
The routines uses a simple butterworth lowpass filter, and applies the 3 lowest order
constraints in the following way:
1) minimum norm: sets mean equal to zero beyond the available data (often the default
constraint in smoothing routines)
2) minimum slope: reflects the data in x (but not y) after the last available data
point. This tends to impose a local minimum or maximum at the edge of the data.
3) minimum roughness: reflects the data in both x and y (the latter w.r.t. to the y
value of the last available data point) after the last available data point. This tends
to impose a point of inflection at the edge of the data—this is most likely to
preserve a trend late in the series and is mathematically similar, though not identical,
to the more ad hoc approach of padding the series with a continuation of the trend over
the past 1/2 filter width.
The routine returns the mean square error of the smooth with respect to the raw data. It
is reasonable to argue that the minimum mse solution is the preferable one.  In the
particular example I have chosen (attached), a 40 year lowpass filtering of the CRU NH
annual mean series 1856-2003, the preference is indicated for the “minimum roughness”
solution as indicated in the plot (though the minimum slope solution is a close 2nd)…
By the way, you may notice that the smooth is effected beyond a single filter width of
the boundary. That’s because of spectral leakage, which is unavoidable (though minimized
by e.g. multiple-taper methods).
I’m hoping this provides some food for thought/discussion, esp. for purposes of IPCC
mike
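To make the three boundary rules in the email concrete, here is a minimal sketch. It is not Mann's Matlab script: the Butterworth lowpass is swapped for a plain centered moving average so it runs on the standard library alone, and the names `pad_end`, `smooth` and `mse` are made up for illustration. "Minimum norm" is assumed to mean padding with the series mean.

```python
def pad_end(y, n, mode):
    """Extend list y by n points past its last value using one boundary rule."""
    last = y[-1]
    mirrored = y[-2:-n - 2:-1]  # the n values before the last point, nearest first
    if mode == "norm":
        # Pad with the series mean (zero for an anomaly series).
        return y + [sum(y) / len(y)] * n
    if mode == "slope":
        # Reflect in x only: tends to flatten the smooth at the end.
        return y + mirrored
    if mode == "roughness":
        # Reflect in x and in y about the last value: continues the local trend.
        return y + [2 * last - v for v in mirrored]
    raise ValueError("unknown mode: " + mode)


def smooth(y, half, mode):
    """Centered moving average of width 2*half + 1, evaluated at every original point."""
    mean = sum(y) / len(y)
    padded = [mean] * half + pad_end(y, half, mode)  # 'minimum norm' on the early end
    width = 2 * half + 1
    return [sum(padded[i:i + width]) / width for i in range(len(y))]


def mse(y, s):
    """Mean square error of the smooth s with respect to the raw data y."""
    return sum((a - b) ** 2 for a, b in zip(y, s)) / len(y)


# Toy input: a noiseless straight line, so the "right" answer is known.
line = [float(t) for t in range(20)]
for mode in ("norm", "slope", "roughness"):
    print(mode, round(mse(line, smooth(line, 3, mode)), 4))
```

On this noiseless line the roughness rule comes out with the lowest MSE simply because it reproduces the line at the late end. That says nothing yet about how it behaves on noisy data.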

It never seems to end, and the “happy” filtering nonsense started being noticed by Willis Eschenbach some time ago.

1. ### kim2ooo said

Reblogged this on Climate Ponderings.

2. ### Skiphil said

c’mon Jeff, what’s wrong with sharing a helpful filter now and then, just between friends:

http://climateaudit.org/2011/01/07/uc-on-mannian-smoothing/#comment-257461

3. ### Genghis said

Conmen, politicians, preachers, climate scientists, enough said.

4. ### OK S. said

Looks like WordPress mangled the WUWT link. Should be:
http://wattsupwiththat.com/2013/03/30/dr-michael-mann-smooth-operator/

5. ### Nick Stokes said

Jeff,
Here is what Mann means by preserving the trend. It’s a very worthy aim.

• ### Jeff Condon said

Not in a temperature series with an unknown future. 😀

• ### Genghis said

Correction, don’t you mean what Mann the plagiarist would have meant if he was Willis?

• ### Carrick said

Nick Stokes:

Here is what Mann means by preserving the trend. It’s a very worthy aim.

Remember we’re talking about smoothing, which requires knowledge or prediction of future behavior.

So in this context, “preserving the trend” is not a worthy aim because it implies you know what the trend is that you need to preserve. It’s listed under “here, let me show my confirmation bias.”

Wishing I had more free time, I’d post an example of how to do it right, but this is clearly wrong.

• ### Nick Stokes said

“So in this context, “preserving the trend” is not a worthy aim because it implies you know what the trend is that you need to preserve.”
No. All sorts of methods, incl. numerical, aim to preserve something. You normalize a filter to have area 1 so that it will preserve area. That’s a worthy aim too. It doesn’t mean that you know what the area will be.

Smoothing can always do something to trend. MRC preserves it. Min slope alters it, for no apparent reason in this example. Which is better?

• ### curious said

Nick – what does filtering/smoothing add to that chart you give?

• ### Nick Stokes said

It’s a test of the method. You can see what the trend should be, and that MRC preserves it, and Min slope doesn’t. That’s how methods are normally tested, on problems where you know what the result should be. That gives confidence to use it when you don’t know what to expect.

• ### kuhnkat said

Nick the Prognostidigitator at work again!!

• ### Carrick said

It produces a trend whether there is one there or not, which is why it’s not valid unless you know a priori the trend will extend into the future.

• ### curious said

I was asking what the smooth adds to that data as it doesn’t look in need of smoothing. I know you and the guys who post here are on top of the maths but what seemed odd to me was that you were presenting a curve not far off y=mx+c and applying a smooth to it to demonstrate some property of the smooth.

IMO if it were to be a more powerful test or demonstration of the smooth’s trend-preserving power, it would be on a signal more complex than a monotonically increasing one. If you were supplied with a truncated data series that was varying wrt time and you applied this smooth to it, how confident are you that when you were supplied with the rest of the series, the trend indicated by the smooth would match the actual trends (over whatever interval was chosen) going forward into the new data?

• ### Nick Stokes said

Curious,
Smoothing data, one often has in mind a model where the data is a smooth curve with added noise, and the underlying curve is to be made clearer. This smoothing is linear, so you can think of its effect on noise and underlying separately. Any smoother reduces noise, and really the only difference between them there is the reduction factor. But they can change the underlying too, and you want that to happen as little as possible.

So we test the operation on curves that are already smooth, and the ones to start with are the constant and straight line. To leave a constant unchanged, the area under the smoother must be 1. To leave a line unchanged, it has to have a zero lag property.

When it comes to a region with an end, the same considerations apply. MRC leaves a line unchanged (“preserved”). That’s a big virtue. Its downside is that the noise attenuation drops to zero at the end. It leaves the end-point unchanged.
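A rough check of the three properties above, assuming (notation introduced here, not in the comment) that the smoother is a centered weighted average with symmetric weights $w_{-h},\dots,w_h$, $w_k = w_{-k}$, that sum to 1. Applied to a straight line $a + bt$:

$$
\sum_{k=-h}^{h} w_k\,\bigl(a + b\,(t+k)\bigr) = (a + bt)\sum_{k=-h}^{h} w_k + b\sum_{k=-h}^{h} k\,w_k = a + bt
$$

The first sum is 1 (unit area, which alone is enough to leave a constant unchanged) and the second is 0 by symmetry (zero lag). At the last data point $n$, MRC pads with $y_{n+k} = 2y_n - y_{n-k}$ for $k = 1,\dots,h$, so the smoothed end value is

$$
\hat{y}_n = w_0\,y_n + \sum_{k=1}^{h} w_k\,\bigl(y_{n-k} + y_{n+k}\bigr) = \Bigl(w_0 + 2\sum_{k=1}^{h} w_k\Bigr)\,y_n = y_n
$$

So under these assumptions the end point passes through untouched, whatever noise it carries, which is the "attenuation drops to zero" downside.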

• ### curious said

Thanks Nick – I think that tallies with my understanding but I might be wrong. I missed you responding to how the filter can or can’t be relied on for trends going forward into new data but I am left thinking it can’t? As far as noise attenuation dropping to zero goes I think that is effectively saying you are no longer filtering? If so, that raises the question of why not just stop the filter at the limit of its usefulness?

• ### Nick Stokes said

Curious,
“why not just stop the filter at the limit of its usefulness?”
But useful for what? Different filters do different things. Min slope gives a less noisy endpoint estimate, but useless trend there. MRC gives a reasonable trend estimate but noisy endpoint.

On going forward, in smoothing you’ve assumed only that there is a smooth underlying function in the interval you’ve smoothed. Now smooth functions do often proceed smoothly into the future, so you can assume that as a predictor if you like, but it’s an extra assumption.

• ### Carrick said

Curious, I think Nick’s big hang up here is he ignores the role of uncertainty. Some methods that are useful when you have noiseless data are practically worthless when the signal-to-noise is poor, because of the way they amplify noise.

This end-point reflection method is particularly bad, because it anchors the “future prediction” on the last day of data.

Nick really likes the way his CO2 trend curve looks. Suppose you did a 10-year average of CO2 and extended the smoothing to 2013 using the end-point method. If CO2 continues to grow as expected your trend won’t be too bad (assuming your data isn’t too “wiggly”). But suppose hypothetically we get a major asteroid impact in say, 2015. I pretty well guarantee the trend at the end of Nick’s curve will be wrong.

Nick likes his result because it fits his expectation that the trend should continue through the immediate future.

There isn’t anything particularly wrong with using our knowledge of probable future growth to forecast. In this case because the signal-to-noise is so good, even the reflection method works well. However, when you have noisy data that you want to smooth (Mann’s case) it’s far from an ideal method.

I suggested on another thread the method described in Harvey and Shephard. Fortunately this is already available for the R language in various packages (reviewed here).

The more conservative approach is to chop the last 1/2 of the smoothing window (don’t extend beyond the end of where you have data), and that’s my preferred approach.
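A minimal sketch of that conservative option, assuming a plain unweighted centered average. The function name `centered_mean_valid` is made up for illustration. Points whose window would run past either end of the data are left as `None` instead of being padded.

```python
def centered_mean_valid(y, half):
    """Centered moving average of width 2*half + 1.

    Positions within `half` points of either end are left as None,
    because the window there would need data that does not exist.
    """
    width = 2 * half + 1
    out = [None] * len(y)
    for i in range(half, len(y) - half):
        out[i] = sum(y[i - half:i + half + 1]) / width
    return out


series = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(centered_mean_valid(series, 2))
```

The cost is visible in the output: the smooth stops half a window short of the last observation, so it has nothing to say about the most recent points.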

• ### Carrick said

Took a couple of minutes off to sit down with it. I think it turns out a sufficient condition for the end-point reflection to work is that the noise needs to be dominated by low frequency (e.g. “1/f” noise). You run into problems with the method when the high-frequency component of the noise in the measurement has a large amplitude.

The Keeling curve is an example of the former (excluding asteroid strikes and the lot). Mann’s proxy data is an example of the latter.

Mind you, I’m not defending setting the slope to zero at the end point. That’s just silly.

My criticism is simply that this particular method for extending the smoothing to the end-points is a poor method for typical climate science applications (low SNR, large high-frequency component).

Here’s an example: The real series has a zero slope at t=0, the end-point reflected method produces an artifactual positive slope.

This is all well known in the community I work in by the way.

• ### Carrick said

By the way, Nick, I noticed you were using a triangular weighting function in your smoother. When I do a running average I usually do an unweighted average (sufficient if the data have a 1/f type noise floor) but I prefer a raised cosine (Hann) window if there’s a lot of high-frequency noise in the data.

For what it’s worth….

• ### Nick Stokes said

Carrick,
“But suppose hypothetically we get a major asteroid impact in say, 2015. I pretty well guarantee the trend at the end of Nick’s curve will be wrong.”

No, the trend isn’t wrong. It doesn’t claim to predict such events. Your speedo causally reports the current trend of distance covered vs time. It might be reporting 80 mph and then you run into a truck. Doesn’t mean you need to get the speedo fixed.

• ### Carrick said

Ok it’s not wrong. It’s just “reality challenged.” 🙄

You saw my example, right?

end-point-reflection.pdf

• ### Carrick said

Lucia blogged on it too

http://rankexploits.com/musings/2009/more-fishy-how-doesdid-mann-guess-future-data-to-test-projections/

• ### Carrick said

Looking through that thread, I can see that Nick was in the same place in 2009.

Curious— I mostly added these notes in case you were still following the thread in the hopes you found them useful.

• ### curious said

Carrick, Nick – thanks for the additional comments. The thread at Lucia’s was good and fwiw her comment no. 15545 makes sense to me.

• ### Carrick said

Curious, I think Lucia’s thread is one of the best ones I’ve seen on it. Good feedback by PaulM & Jean S among others, and of course Lucia’s study of the problems with Mann’s algorithm. I agree Lucia’s #15545 is spot on:

It makes a heck of a lot more sense to simply admit you can’t smooth the data near the end points and avoid trying to deceive yourself into thinking you can give an endpoint treatment a fancy name and get decent smoothed information around the endpoints.
Thinking that Mann (or someone else) has come up with a solution for the issue is exactly what is bogus.
It would have been much better if Mann (and others) were to simply write a paper decreeing that their smooth lines should just stop n/2 prior to the end point.

Jorge’s comments are good as usual. See #15661:

The only honest thing you can do is what Lucia suggested and show that part of the plot with dashed lines. If you want to be scrupulously honest you will add a footnote warning that the dashed area is “subject to revision”.

I’ve done the “dashed line” version in a few informal presentations (but not for a publication).

It’s my impression every generation of people entering data analysis reinvents the end-point reflection technique if it wasn’t already introduced to them, finds the problems with it, then learns later other people use it. (I had implemented it in the mid-1980s, then quickly discarded it.) Same for diddling with variable numbers of points in the filter as you approach the end points.

I find it mildly creepy that it gets used in the guts of a “blackbox” algorithm like filtfilt.

I find myself drawn to real forecast models like Harvey’s method I linked to above as a much better alternative to ad hoc forecasting methods like the end-point reflection technique.

• ### Nick Stokes said

Carrick,
One of the things I did think a bit odd about Mann’s papers was that GRL was publishing them. I thought they were quite good, but as you say it’s old stuff that lots of people had dabbled with – it’s in the guts of Matlab etc. However, as a mathematician who deals with all kinds of literature, to find people impressed with unoriginal math is not uncommon.

But the odder thing is the response. Again, these papers are just describing a generally sound mathematical technique, maybe overrated, that’s in Matlab, recommended by Aust Bureau of Stats, etc. It’s really nothing to do with climate science. Yet for some reason Jeff finds something sinister about a simple exposition by Mann in an email. And Lucia says it’s full of bogosity. CA had lots of threads saying, well, something.

Mann really gets people going.

• ### Jeff Condon said

“Mann really gets people going.”

Because he is constantly tweaking the math to show global warming in the most extreme light. Hell, the CPS and MV regression hockeysticks are completely bogus, not to mention Steig et Mann in the Antarctic. How about his regression-doesn’t-cause-variance-loss nonsense? I have reviewed a fair number of Mann papers in full detail; when you see Mannian math, you can almost always find an unscientific bias.

Describing this as a “simple exposition” is an unwillingness to see reality. The description of this filtering method by Mann was for use on a temp series, which less extreme scientists claim will have decadal-scale downturns, yet don’t want those little details showing up on their blades.

• ### Carrick said

Nick, I agree that Mann does get people in a tizzy sometimes. But truthfully, I’m genuinely not interested in the Mann angle. I’m approaching this more as a research topic: “Somebody claims XXX, and when is what he is saying true?”

I’d do the same if Richard Feynman were still writing. It’s not anything personal here, at least on my part.

I know much better approaches than he advocated here. If I started a blog, it would be technique oriented, rather than personality oriented. That’s what I like about your blog.

• ### Nick Stokes said

Carrick,
Here’s a very rough trend error analysis of MRC. Simple noise process, unit timesteps, unit noise amplitude. Should have trend zero. Typical noise value at end, +1 (or -1). Details skipped here.

That amplifies to a jump of 2 on reflection. So the error introduced by reflection into trend reported will be about the trend of 2*stepfn.

Unsmoothed, that trend would be a delta fn 2. Smoothed, it is of order 2/N (central value), where N is the length of the filter. Without MRC the two-sided estimate of trend noise would be somewhat less.

That’s an average outcome; the endpoint treatment can’t be expected to actually improve noise performance. That part depends on a single realisation value of the noise, so I can’t see that it’s frequency dependent (for equal amplitude).

So the endpoint cost is to roughly double the noise of the trend, but “preserve the trend” of the underlying.
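The back-of-envelope numbers above can be checked on a toy case. This is a sketch under stated assumptions, not Nick's calculation: the underlying series is flat at zero, the only "noise" is a single +1 on the last point, the smoother is an unweighted centered average of length N = 2*half + 1, and the names `reflect_pad` and `end_slope` are made up here.

```python
def reflect_pad(y, n):
    """MRC-style padding: reflect the last n points in x and in y about the final value."""
    last = y[-1]
    return y + [2 * last - v for v in y[-2:-n - 2:-1]]


def end_slope(y, half):
    """One-step slope of the smoothed curve at the last data point."""
    padded = reflect_pad(y, half)
    width = 2 * half + 1

    def smoothed(i):
        # Centered mean at original index i (no early padding, so indices line up).
        return sum(padded[i - half:i + half + 1]) / width

    last = len(y) - 1
    return smoothed(last) - smoothed(last - 1)


flat_with_bump = [0.0] * 20 + [1.0]  # true trend is zero; last point is off by +1
print(reflect_pad(flat_with_bump, 3)[-5:])  # the +1 becomes a step up to 2 past the end
print(end_slope(flat_with_bump, 5), 2 / 11)  # reported end slope vs 2/N with N = 11
```

In this toy case the reported end slope is 2/N exactly, consistent with the "order 2/N" estimate. It covers a single end-point error only, not a full noise realisation.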

ps didn’t like the CA post. It’s big on the “silkworm plot” but doesn’t note that you’d get a silkworm with two-sided trend estimates too. Maybe half as hairy, but still, I’d like to see a one-sided filter that does better.

• ### Carrick said

Nick, most of my comments relate to end point reflection method. Here is feedback on the triangular smoothing algorithm.

As I indicated, there are places it’s useful. FILTFILT uses it, for example, but most people still know to ignore the behavior near the boundaries.

The big problem I have with it, is the difficulty in quantifying the error associated with this “forecast method”. Niche Modeling had a bit on this (though the figure is missing).

Regarding Mann’s smoothing algorithm, I’m not a big fan of reducing the filter size as you approach the boundaries.

I also don’t see much value in a triangular window here. If you have a significant high-frequency component and need to go away from the simplicity of the rectangular window, I’d use a different window function than the triangular one. My preference if I’m stuck with this for some reason is either Welch or Hann (raised cosine, with the proviso you extend the size of the filter by “2” so the values at the end points aren’t zero, this matters if you are going to use Mann’s “narrowing filter” near the boundaries).
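For reference, a small sketch of the window shapes mentioned here, each built on n + 2 points with the two zero end points dropped so that all n weights are non-zero. That is one reading of the "extend the size of the filter by 2" proviso, not a definitive one. The function names are illustrative, and the weights are normalized to sum to 1.

```python
import math


def _normalize(raw):
    """Scale weights so they sum to 1 (unit area)."""
    total = sum(raw)
    return [v / total for v in raw]


def triangular(n):
    """Triangular (Bartlett-style) weights with non-zero end points."""
    return _normalize([1 - abs(2 * j / (n + 1) - 1) for j in range(1, n + 1)])


def welch(n):
    """Welch (parabolic) weights with non-zero end points."""
    return _normalize([1 - (2 * j / (n + 1) - 1) ** 2 for j in range(1, n + 1)])


def hann(n):
    """Hann (raised cosine) weights with non-zero end points."""
    return _normalize(
        [0.5 - 0.5 * math.cos(2 * math.pi * j / (n + 1)) for j in range(1, n + 1)]
    )


for name, window in (("triangular", triangular), ("welch", welch), ("hann", hann)):
    print(name, [round(w, 3) for w in window(5)])
```

Keeping the end weights non-zero matters mostly when the window is shortened near a boundary, since a short window with zero end weights would effectively ignore the outermost points it is supposed to include.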

For an application like this (smoothing), I’d just use an acausal Butterworth filter in place of the triangular filter. This way, I throw away fewer points near the end.

I wonder if you still think a centered triangular filter is an example of a causal filter?

• ### Carrick said

Wish there was an edit function. The first two pars should read:

Nick, most of my comments relate to end point reflection method. As I indicated, there are places it’s useful. FILTFILT uses it, for example, but most people still know to ignore the behavior near the boundaries.

Here is feedback on the triangular smoothing algorithm.

6. ### omanuel said

Jeff, today’s world leaders are frightened cowards, like the Wizard of Oz, facing this harsh reality with an army of fraudulent scientific advisors.

a.) Limited fossil fuels
b.) Increasing world population,
c.) Increasing demands for energy per person, and
d.) A corrupt federal “scientific-technological elite” that hid the source of energy that destroyed Hiroshima and Nagasaki in Aug 1945.

https://chiefio.wordpress.com/2013/03/31/imf-carbon-dreams/#comment-49338

7. ### Matthew W said

“Fortunately the math which Willis discusses this time, is relatively lightweight stuff”

If you and your engineer buddies are talking shop, then it is ” relatively lightweight stuff”, but not for some of us !!

8. ### jim2 said

Jeff – I have a question. Does an n-order polynomial fit have any end-point problems, n being 2 or 3 or so? If not, why isn’t that the preferred method?

• ### Jeff Condon said

Jim,

As Nick points out above, all filters assume things about the end points. Polynomial fits make sense in different situations, for instance if you expect an exponential decay. On the negative side, polynomial fits don’t have a fixed frequency response, they can provide artificial stiffness where the fit isn’t good, data points are not equally weighted, and they do whatever they want outside of the range of data. The result is that you often don’t really know what assumption is being made.

There are numerous examples in the emails of “scientists” looking for ways to present both temperature and proxy data with an increased uptick. I believe I recall a Trenberth quote about deletion of recent temp points for public presentation. This one by Mann is just another example of activism hidden in “science”.

• ### jim2 said

The uptick point is valid, but there is also the problem that just about any filtering technique reduces the extremes. If they then tack on a non-filtered instrumental temperature, the instrumental one already has an “advantage” over the filtered proxy. Also, I think there are good reasons why tree rings, for example, are apples and the instrumental record oranges.

• ### Mark T said

Well… All filters assume things, though not necessarily about endpoints in particular. There are plenty of filtering methods that can run right up to the end of a record without caring about what it looks like. This is particularly important for tracking problems (e.g. radar).

Mark

9. ### grumpydenier said

Reblogged this on grumpydenier and commented:
Ah, models. Once Jean Shrimpton retired, I lost interest. I can’t believe people still put any faith in this stuff any more.

10. ### Alan D McIntire said

William Briggs addressed this back in 2008

http://wmbriggs.com/blog/?p=195

“Now I’m going to tell you the great truth of time series analysis. Ready? Unless the data is measured with error, you never, ever, for no reason, under no threat, SMOOTH the series! And if for some bizarre reason you do smooth it, you absolutely on pain of death do NOT use the smoothed series as input for other analyses! If the data is measured with error, you might attempt to model it (which means smooth it) in an attempt to estimate the measurement error, but even in these rare cases you have to have an outside (the learned word is “exogenous”) estimate of that error, that is, one not based on your current data.

If, in a moment of insanity, you do smooth time series data and you do use it as input to other analyses, you dramatically increase the probability of fooling yourself! This is because smoothing induces spurious signals—signals that look real to other analytical methods. No matter what you will be too certain of your final results! …..”

11. ### jim2 said

I would feel better about climate scientists’ ability to predict the future if they did a better job not smoothing over the past.

12. ### Kon Dealer said

Nick Stokes’s idea of smoothing would appear to be to try and smooth out the irregularities of climate “psience”.

13. ### hunter said

WHERE IN THE @#%$#^$!!! IS CG3?
It has been more than long enough. Good, bad or indifferent, get the data out in the public sector where it belongs.
It is astonishing to me that not ONE skeptical blogger has simply posted the CG3 release.
And it is more astonishing that it is not being spoken of.
This smacks of everything that skeptics have been rightfully upset about regarding the AGW promoters.

• ### Jeff Condon said

Sorry Hunter. So far my reading has nothing earth shattering in the emails. It is actually quite difficult to parse 220,000 of them though. I called my buddy Assange and he doesn’t want to touch it either.

• ### hunter said

Jeff,
lol regarding Assange. Crowd sourcing CG3 is the best option. In all seriousness, it needs to be done.
Break it up into a PM committee and sub committees for a few thousand e-mails. Many will link badly, but even if it started arbitrarily in divisions, it will be some progress.
Mosher was on to some sort of spook-lite sorting at one time. Any report from him?

• ### Matthew W said

Maybe all the best stuff came out in I & II.

14. ### stan said

Jeff,

Tom Nelson had this story about Prof Easterbrook testifying about declining temps and how the 30s were hotter. The Dems in the state senate don’t believe that the scientists would adjust temps. http://www.energycentral.com/functional/news/news_detail.cfm?did=28055120

Just shows that even when they admit that the data has been adjusted, the sheep don’t understand. All they know is what they get in the press releases.
