Parsing Emails

So I spent several hours today writing scripts which parse the emails. I was hoping for continuations of some of the more interesting conversations we are familiar with, but so far I have found little more than a group of advocates for catastrophic climate change, doing what they do. They fully believe that the fact that proxy data doesn’t match temperature in no way calls into question the randomly selected proxy data. Some question whether it is ok to paste data on the end of a series. Still, there are others who advocate more study, stating that the “act-now” advocates are not honest scientists. Again, I’m reminded of the organized and funded attacks against anyone who notices the problems with their work. It is really shocking to read how they followed through with attacks against those who don’t fall in line. Mann in particular is thin-skinned, and his angry attacks on other advocates not pushing his version of history pressure those with little backbone to play both sides of the fence.

If you want the meaning of the emails, you have to be able to read them, and CG1 and 2 have most everything we need to know in them so far. Beyond the three-word “hide the decline”, the average member of the public has no interest. So far, I have found no new pithy quote with the kind of clarity that CG1 revealed. I did find a large number of emails which we have covered in topic before. Some have new replies, but I’ve noticed nothing which was tremendously interesting.

There were so many nuances in these emails.  Remember this email from Michael Mann (my bold):

Date: Tue, 14 Oct 2003 17:08:49 -0400
Subject: Re: smoothing
Bcc: >
correction ‘1)’ should read:
‘1) minimum norm: sets padded values equal to mean of available data beyond the
available data (often the default constraint in smoothing routines)’
sorry for the confusion,
At 05:05 PM 10/14/2003 -0400, Michael E. Mann wrote:

Dear All,
To those I thought might be interested, I’ve provided an example for discussion of
smoothing conventions.  Its based on a simple matlab script which I’ve written (and
attached) that uses any one of 3 possible boundary constraints [minimum norm, minimum
slope, and minimum roughness] on the ‘late’ end of a time series (it uses the default
‘minimum norm’ constraint on the ‘early’ end of the series). Warming: you needs some
matlab toolboxes for this to run…
The routines uses a simple butterworth lowpass filter, and applies the 3 lowest order
constraints in the following way:
1) minimum norm: sets mean equal to zero beyond the available data (often the default
constraint in smoothing routines)
2) minimum slope: reflects the data in x (but not y) after the last available data
point. This tends to impose a local minimum or maximum at the edge of the data.
3) minimum roughness: reflects the data in both x and y (the latter w.r.t. to the y
value of the last available data point) after the last available data point. This tends
     to impose a point of inflection at the edge of the data—this is most likely to
     preserve a trend late in the series and is mathematically similar, though not identical,
     to the more ad hoc approach of padding the series with a continuation of the trend over
     the past 1/2 filter width.
The routine returns the mean square error of the smooth with respect to the raw data. It
is reasonable to argue that the minimum mse solution is the preferable one.  In the
particular example I have chosen (attached), a 40 year lowpass filtering of the CRU NH
annual mean series 1856-2003, the preference is indicated for the “minimum roughness”
solution as indicated in the plot (though the minimum slope solution is a close 2nd)…
By the way, you may notice that the smooth is effected beyond a single filter width of
the boundary. That’s because of spectral leakage, which is unavoidable (though minimized
by e.g. multiple-taper methods).
I’m hoping this provides some food for thought/discussion, esp. for purposes of IPCC

After reading from these same people how biased the well-funded “right-wing” skeptics with ties to industry are, to read that reflection of a trend at the end of a hockey stick “might” be proper science is a little difficult to swallow. Don’t forget that this is a 2003 email, and we now know that temps have stayed relatively flat since then. The reflection Mr. Mann proposed is therefore ad hoc, and can now be proven inaccurate.
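For readers who want to see what the three boundary constraints in the email actually do to the end of a series, here is a minimal sketch of the padding rules only. It follows the email's wording (with the correction for minimum norm), not Mann's matlab script, which I have not reproduced; the function name `pad_end` and the rule labels are my own.

```python
def pad_end(series, n_pad, rule):
    """Extend the late end of `series` by n_pad points using one boundary rule."""
    if n_pad > len(series) - 1:
        raise ValueError("not enough data to reflect")
    last = series[-1]
    # Points before the last one, nearest first: series[-2], series[-3], ...
    mirror = [series[-2 - k] for k in range(n_pad)]
    if rule == "minimum_norm":
        mean = sum(series) / len(series)
        tail = [mean] * n_pad                    # pad with the mean of the data
    elif rule == "minimum_slope":
        tail = mirror                            # reflect in x only
    elif rule == "minimum_roughness":
        tail = [2 * last - v for v in mirror]    # reflect in x and in y about the last value
    else:
        raise ValueError(rule)
    return list(series) + tail


line = [0.0, 1.0, 2.0, 3.0, 4.0]
for rule in ("minimum_norm", "minimum_slope", "minimum_roughness"):
    print(rule, pad_end(line, 3, rule))
```

On a perfectly straight line, only the minimum roughness rule continues the line past the end; the other two bend it back. Whether continuing the line is the right assumption for a noisy temperature series is the point in dispute.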

In the end, today’s reading was 99.9 percent review of just how loose a game is being played.  It shouldn’t be overlooked that the purpose of the enzyte filter Dr. Mike proposed is for publishing in the premier global warming report of all time.



69 thoughts on “Parsing Emails”

  1. I’m not sure what your complaint is there, Jeff. He’s talking about smoothing to the end of the interval. Preserving a trend is exactly what you should want to do. Do you think it should be altered? To what?

    He’s just listing the virtues of his method, exactly as he did in his published paper on it.

    1. ” Preserving a trend is exactly what you should want to do.”

      Not true, Nick. The temperature system of the planet is a thermal mass. The assumption that the trend will continue is a prediction of future climate. That assumption a priori incorporates all kinds of model performance expectations, which since 2003 (conveniently for me) didn’t actually come true. My point is that there is a lot of this blinder thought in these emails, and the scientists and advocates are quietly discussing whether these trend-tweaking assumptions are reasonable. I would run a window filter of any distribution you like right to the end of the graph and, instead of padding with zeros, would simply use NA, so the frequency response would increase a little at the end but the actual points would be creating the data. Frequency response of a filter at the end is not a problem because temp curves aren’t for launching rockets into a micro-sized orbital window; they are to show what the temperature has already done.
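The NA approach described above can be sketched in a few lines. This is an illustration under my own assumptions (a flat window and a function name, `truncated_window_mean`, that I made up); the comment allows any window shape.

```python
def truncated_window_mean(series, half_width):
    """Centered moving average that only averages points that exist."""
    n = len(series)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = series[lo:hi]              # the window shrinks near either end
        out.append(sum(window) / len(window))
    return out


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(truncated_window_mean(data, 2))
```

No values are invented past the end of the record; the cost is that the last few smoothed points are averaged over fewer samples.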

      Divergence of treering data is another example. Ed Cook is fully willing to admit that they know nothing (or very little) about historic climate, yet when faced with the obvious truth that treerings aren’t matching temp, none are willing to recognize the high possibility that they are performing nothing but high frequency wiggle matching.

      The hockey stick McIntyre is working with now is a third example. Data was shifted such that the average popped up when certain series dropped out. The series themselves are flatly ridiculous, the spike at the end is an artifact of stupidity, and the geniuses simply ignore the silly thing and state that it is likely “not robust”. Complete advocacy disguised as science.

      1. Jeff,
        Mann is talking about a general time series smoothing technique, not recent temperatures. And it’s not a prediction. To smooth anything, you make an idealized assumption about a form that it is currently following. Usually that it is some form of polynomial. So you fit the coefficients and use the fitted curve as the smooth. This has been around for centuries.

        Mann’s reflection technique is just a neat way of doing the algebra. The end result is still an interpolation using the value of known points.

        His talk of preserving the trend is making a contrast with the next order down – minimum slope. This works on a symmetry that the slope at the end is zero. Often that is quite unreasonable. It would give a wrong prediction for points that actually lie exactly on a straight line. Mann would get that right.

        1. I don’t usually disagree with you like that Nick. I am of course familiar with the x-y reflection technique, as are you, and it is about as basic as they come for end-point methods. It is quite well known and even was standard on some basic software 20 years ago. Nothing “neat” about it.

          It does make the assumption that the trend will continue, so it is an obvious prediction. Why you say it isn’t is beyond my understanding. It provides a maximal high point on the end of an uptrend, and if the data is truncated at the right point you give the impression that the trend is launching upward. In the time period of 2003, temps had a little downhook which was driving them wild. Wigley even discussed removing the endpoints from the graph to give the right impression to the public.

          It is just one more baby step in the unprecedented advocacy direction.

          You say that a zero slope assumption is often unreasonable. This is a specific situation we are discussing though, one of thermal mass where short term deviations from the alleged climate trend are quite common. The only purpose for reflecting the end in this case is the same one we constantly run into in paleoclimatology.

          1. Jeff,
            ” This is a specific situation we are discussing though, one of thermal mass where short term deviations from the alleged climate trend are quite common.”

            Well, you may be, but I can’t see that Mann is. His discussion up to and including the bolded bit is perfectly general mathematics. He later uses the temp series as an example, but only to show that MRE gives the lowest sum squares.

            He was obviously in the process of writing this paper, where Fig 1 is the example he cites, saying almost exactly the same things about it.

            The IPCC reference is probably a reference to a decision to, in Chap 3 I think, nominate a standard smoothing process for that chapter. There may have been a thought that it should be used generally. That smoother was a minimum slope, and Mann doesn’t like it. Neither do I. Insisting that all data be pushed to show a zero slope at the end seems absurd. I once showed at CA (now gone) what that did to the CO2 curve. Again, preserving the trend is just the right thing to do.

            But aside from the maths etc, I don’t see where CG3 comes in. It seems to be another case where you’ve found a scientist discussing in an email what he is in fact going to say in a published paper. Why not refer to the paper?

          2. He used the temp series which I expected he had used? hmm…
            If we continue plotting measured data on those old 2004 series, did it match his suggested filter? no?? hmm.
            Did his suggested filter show a higher and unprecedented or lower trend? hmm

            I do agree that it was a person who holds themselves to be a scientist and did write the paper but it seems to me that he has done exactly as I had critiqued. It also seems to me that in the temperature case, my simple method would have given more accurate results, as would the zero slope method – which I don’t like for temp or CO2.

            What happened to your CA link on the CO2 curve? That would be the correct curve to use this xy reflection method on. It has a low noise level and a definite trend with a known and continuing driver.

          3. Jeff,
            You’re still evaluating in terms of predictions. Mann is smoothing.

            Every linear op of this kind – smoothing, integrating, differentiating etc – can be seen as a projection onto a finite set of basis functions (eg polynomials order<n) and then getting the result in the basis space. The effect of the x-y reflection is to restrict to a set of odd functions about the endpoint. In poly terms, (1,x,x^3,x^5…). The min slope restricts to (1,x^2,x^4 etc). That's all. You just project within the range on this set.

            My CO2 plot went at the CA CG1 crisis – pictures weren't allowed after that, and earlier pictures vanished.

          4. Filters have assumptions which are functionally predictions of future performance at the time series endpoints. Thus, mirroring, zero slope etc… Are you saying that you don’t understand them to be predictions?

            Regarding the loss of the images, is this something Steve did on purpose – to remove your result (which sounds to be common sense) and deny images after that point?

          5. Jeff,
            CA just moved to a new server and didn’t transfer the commenter’s pics (which would have been a big job). It wasn’t just me.

            As to prediction, well, the basis functions do continue, so they are capable of giving values for ever. But using them for smoothing doesn’t involve a claim that they work outside the range.

            It’s like fitting a Fourier series. You get a good and useful fit. It’s actually periodic outside the range, which is a prediction, but it’s usually not one you assert.

          6. It looks like my reply to your plot was the same as my reply this time.

            The prediction isn’t that the functions MAY be continued, the prediction is how those projected future points affect the existing values at the endpoints and thus how the existing values are offset. Padding of any style projects future information onto existing values — thus prediction.

            It is odd to me that we are having this conversation.

          7. Nick Stokes:

            As to prediction, well, the basis functions do continue, so they are capable of giving values for ever. But using them for smoothing doesn’t involve a claim that they work outside the range.

            I think you are mistake here.

            As you get close to the end points, that does involve some prediction/forecasting/backcasting. That or you get progressively less smoothing. With Mann’s method he does have to fill in past the last point he wants a smoothed curve to.

          8. It seems lately every time I’ve sat down to write a comment, one of my sons has come in my room and started talking. I assure people I’m not losing my mind. I meant to say “I think you are mistaken here.”

          9. Carrick, Butterworth…
            Yes. It’s a symmetric filter and at the end the reflected data is odd, with endpoint as origin. So the final point is the last data point.

          10. Then hopefully you see why I’m saying

            As you get close to the end points, that does involve some prediction/forecasting/backcasting,

            as “the forecast” here corresponds to the x-y reflected data. If you scrunched down the filter so you never went past the end point, as with your explanation

            Yes, you get progressively less smoothing. At the end, you have none – the endpoint is pinned.

            there wouldn’t be a need to discuss x-y reflection.

            I believe that Mann implements his recursive Butterworth using Matlab’s FILTFILT routine (that’s a link to code btw), which is why I said seeing his script would be informative.

            FILTFILT very clearly uses x-y reflection as you can see from the code, and recognizes that this is a form of extrapolation:

            % Extrapolate beginning and end of data sequence using a "reflection
            % method". Slopes of original and extrapolated sequences match at
            % the end points.
            % This reduces end effects.
            y = [2*x(1)-x((nfact+1):-1:2);x;2*x(len)-x((len-1):-1:len-nfact)];

            % filter, reverse data, filter again, and reverse data again
            y = filter(b,a,y,[zi*y(1)]);
            y = y(length(y):-1:1);
            y = filter(b,a,y,[zi*y(1)]);
            y = y(length(y):-1:1);

            % remove extrapolated pieces of y
            y([1:nfact len+nfact+(1:nfact)]) = [];

            if m == 1
            y = y.'; % convert back to row if necessary
            end

            X-Y reflection may make the data look visually more “pleasing”, but it doesn’t improve the accuracy of the smoothed data near the end points.

            You are probably aware there are R-language implementations of Kalman-filter based smoothers (see the Harvey-Shepard link below for a better technical explanation of the method). I believe this would be a better choice of algorithms for smoothing data to near the end points.
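As a rough idea of what a Kalman-filter based smoother looks like, here is a minimal sketch for the simplest state-space model I know of, a random-walk level observed with noise. It is not the Harvey-Shepard method and not any particular R package; the function name and the variance parameters `q` and `r` are assumptions chosen for illustration.

```python
def local_level_smoother(y, q, r):
    """Kalman filter plus backward (fixed-interval) smoother for a random-walk level.

    q is the assumed variance of the level's step, r the assumed noise variance.
    """
    n = len(y)
    a_filt, p_filt, p_pred = [], [], []
    a, p = y[0], 1e7                       # vague prior centred on the first value
    for t in range(n):
        pp = p + q if t > 0 else p         # predicted variance of the level
        k = pp / (pp + r)                  # Kalman gain
        a = a + k * (y[t] - a)             # filtered level
        p = (1.0 - k) * pp                 # filtered variance
        a_filt.append(a)
        p_filt.append(p)
        p_pred.append(pp)
    a_smooth = a_filt[:]
    for t in range(n - 2, -1, -1):         # backward pass over the interior points
        c = p_filt[t] / p_pred[t + 1]
        a_smooth[t] = a_filt[t] + c * (a_smooth[t + 1] - a_filt[t])
    return a_smooth


print(local_level_smoother([0.0, 2.0, 1.0, 3.0, 2.0, 4.0], q=0.1, r=1.0))
```

The last smoothed value here is just the filtered estimate, built only from data that exist, so no padding of the series is needed. How well that behaves on real temperature data depends entirely on the assumed model and variances.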

          11. Regarding my other comments, here’s a simple example of a linear trend + gaussian white noise:


            The blue curve is the original data, the red curve is the x-y reflected version ala FILTFILT.

            You can very clearly see the artifact that gets introduced by using the end points to perform the xy reflection, which occurs when there is a non-zero trend and poor SNR (typical of the data that Mann works with).

            FILTFILT works better IMO when you have trend-less quasi-periodic data, and when the SNR is fairly large.
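A small sketch of the pinning effect discussed above, for anyone who wants to try it. It reuses the reflection formula from the FILTFILT excerpt but, as a simplifying assumption, swaps the Butterworth filter for a plain symmetric moving average; the trend, noise level, seed and function names are all made up for illustration.

```python
import random


def reflect_pad_xy(x, n_pad):
    """Point-reflect both ends about the end values, as in the FILTFILT excerpt."""
    head = [2 * x[0] - x[k] for k in range(n_pad, 0, -1)]
    tail = [2 * x[-1] - x[-1 - k] for k in range(1, n_pad + 1)]
    return head + list(x) + tail


def flat_smooth(x, half_width):
    """Symmetric moving average on the reflected series, trimmed to the original length."""
    padded = reflect_pad_xy(x, half_width)
    width = 2 * half_width + 1
    return [sum(padded[i:i + width]) / width for i in range(len(x))]


random.seed(0)
# Linear trend plus Gaussian white noise (poor SNR on purpose)
data = [0.05 * t + random.gauss(0.0, 1.0) for t in range(100)]
smooth = flat_smooth(data, 10)
print(data[-1], smooth[-1])
```

With a symmetric window, the reflected points cancel the real ones in pairs, so the smoothed end value equals the raw end value: whatever noise sits on the last point passes straight through.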

        2. Nick, I think Jeff is right about this. Since the temperature signal in the data is low pass in character, using the reflection method will introduce a high-frequency response in the reconstruction that must be unphysical. This is similar to one of the problems we know must exist with Marcott’s reconstruction.

          1. I should mention that my statement depends on there being high-frequency noise present in the measurement. In Mann’s case, the positive trend is imposed by fitting signal + noise to a temperature measurement with signal + noise. Even though you have an expectation that the long-term trend will be positive, if you force the signal+noise to follow this, you’ll end up amplifying the noise to enforce it.

          2. Sorry, Carrick, you’ve lost me there. But in any case Mann is applying a least squares test, and says that MRE does best. That seems to indicate that it does best with noise.

            In any case smoothing to an end-point is hard, and something has to give. A squeak in the noise doesn’t seem fatal.

          3. As usual, I try to stay focussed on the issues that I’m commenting on (i.e., I don’t see the relevance of the comments over Mann’s MRE here).

            Smoothing to end points is inherently noisy, except with periodic series, in which case it wasn’t really an end point. However, some methods aren’t just noisy, they actually amplify the noise.

            X-Y reflecting a series around an end-point to enforce a non-zero trend is a really bad way of doing it. It works fine for noiseless data, but when there is noise, you’ve just added a spike at the end point of the series equal to the noise present at that end point.

            Thus, you’ve just introduced an artifact into your data with your reflection method. By definition that artifact is “unphysical.”

            In cases where the SNR is high this doesn’t matter. The case here isn’t one of high SNR so it does.

            Not saying there aren’t workarounds, but it doesn’t appear Mann is aware of them.

          4. Actually it’s a bit worse than just a spike… I’m referring specifically to this statement “reflects the data in x (but not y) after the last available data point.” This method actually adds an artifactual step in the trend: It introduces a noise equal to twice the noise in the measurement at the end point. Reflecting in x can also be very bad if “x” is time, or it is for us old crusty fizzlecists who still believe in an arrow of time.

            I did notice in re-reading Mann’s comment that he’s aware that people use trends instead. I don’t know why he thinks x-y reflection would be inherently better than, say, something like Harvey and Shepard 1993. Since that is a statistical forecasting tool, it seems like a much better tool to explore end points with than x-y reflection.

  2. I could imagine that this type of smoothing could be used on certain data series where the trend behaviour is constrained by a well understood process, but only if error bars were included.

    In the case of a temperature series, all three of these methods are (I suspect) equally likely. Absent the assumption of a current forcing imbalance, the next 10 years of temp will a) trend down, b) trend flat, c) trend up. I fail to see how these 3 equal weighted options can add value to any extrapolation or model.

    I’ve played with REAL data too many times to get tricked into thinking I’ve seen a proper average, only to find that (as an example) my first 10 circuit samples were outliers, and the mean behaviour was so different that the circuit didn’t work.

    1. Always expect the worst. In my experience, young EE’s in particular fall into the trap, the good ones check. Everything is so clean and neat most of the time that I find myself asking them if they have “checked” the circuit once it has been built. I have a product launch coming up on a new board which is complex (for me) and I’m very concerned that something will crop up during launch. It has a 100kHz pwm signal at over 2 amps which I’ve done everything I can think of to minimize transmission from. There are several unique challenges which took quite a bit of effort to beat out. Thermal, electrical, optical, and software designs are being pushed hard on this one. The way they operate, there is no possibility that any of these sciency paleo guys could work half of it out. In our world, the thing has to work or the god of Physics will smite you!

  3. Thanks, Jeff, for all your efforts to unravel the 2009 Climategate mystery. The mystery began in 1946 with the publication of misinformation on the Sun’s internal composition and source of energy.

    The motivation: “Fear and loathing of the destructive nature of human’s who must be subjugated to prevent them from destroying the earth via using the forbidden knowledge of the Sun’s source of power, (E = hv) and (E = mc^2), neutron repulsion.”

    In 2001 while global climate temperatures were being adjusted to fit climate models, solar neutrinos started to oscillate to fit the consensus solar model in order to deny new empirical evidence of “The Sun’s origin, composition and source of energy”

    The participants in these frauds had absolutely no idea they were selling their own friends and family into servitude for a few grant dollars.

  4. Richard Lindzen gives a compelling account of how we got here.

    lindzen-on-climate-science-2010.pdf

    A key point:
    “When an issue becomes a vital part of a political agenda, as is the case with climate, then the politically desired position becomes a goal rather than a consequence of scientific research. . .In particular, we will show how political bodies act to control scientific institutions, how scientists adjust both data and even theory to accommodate politically correct positions, and how opposition to these positions is disposed of.”

  5. Quite frankly, I don’t understand the obsession with data smoothing that goes on in climate science. Unless one has a theory and a prediction of how the time series is supposed to behave, in which case one would fit the theory to the data, there is no justifiable reason for drawing a curve through a time series. In addition, smoothing throws away a lot of information. The fluctuations provide valuable information about the underlying physical processes through the power spectral density of the series. Compute the power spectrum and these theological disputes on handling end points would, thankfully, go away.
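Computing a power spectrum needs nothing exotic. Here is a minimal periodogram sketch using a plain discrete Fourier transform; it is slow and unwindowed, the test signal is synthetic, and the function name is my own, so treat it as an illustration rather than a recommended estimator.

```python
import cmath
import math


def periodogram(x):
    """Return (frequency, power) pairs for the mean-removed series.

    Frequencies are in cycles per sample, from 0 up to the Nyquist frequency.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        spectrum.append((k / n, abs(coeff) ** 2 / n))
    return spectrum


# Synthetic check: a sine wave with exactly 8 cycles in 64 samples
series = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
peak_freq, peak_power = max(periodogram(series), key=lambda fp: fp[1])
print(peak_freq)
```

For a real record one would normally taper or average the estimate (the email itself mentions multiple-taper methods), but even this raw version keeps the fluctuation information that smoothing discards.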

    There was once a paper in Physical Review Letters that showed the power spectra of measured temperature time series were different from the model outputs. Maybe that’s why the fluctuations are ignored.

  6. Re: Carrick (Mar 17 17:50),

    Actually Mann is totally clueless about filtering in general. In his later works (2002 onwards) he’s been using a zero-phase (filtfilt in Matlab) Butterworth filter with various paddings. The only reason for the use of that filter seems to be that it gives similar results to the “filtering” he used in MBH9X (Mike’s Nature trick).

    About four months ago Mann (or his grad student) did some restructuring of his web pages. As a result, some files (which have not been publicly available at least since 2004, when I first started collecting MBH9x related data) turned up on the web site. Among these files there is a high resolution postscript figure of the MBH98 (also MBH99) with the smoothed curve in it. It is easy to extract a curve from a PS file, so we got digital versions of the MBH smoothings (something which has not been available before). The only way previously to try to figure out exact details of the smoothing was to visually compare (as UC did) to a (especially in the case of MBH98) low resolution image graph. So I started research together with UC in order to figure out the exact parameter details of Mike’s Nature trick smoothing, in order to conclusively prove that it was padded with direct instrumental data (it is) and not with mean or similar.

    I have to say it has been an interesting journey … we now know pretty much exactly how Mann did smoothing. All I say now is that it is much, much, much worse (in technical terms) than we had thought and shows that his understanding of filtering was on the level of an average undergrad student. I’m sure you, Jeff, Nick and others with technical understanding will be astonished to learn what the smoothing in MBH9x was…

    I had actually planned to make posts about these issues around this time, but I think there is plenty of other stuff over at CA for a while, so I’ll postpone that. Other than that, I’m rather busy right now anyhow.

    Jeff, can you locate an email where Mann is revealing his Nature trick to Jones? I’m confident that if it is in the emails, it is within a half-year timeframe before the infamous email.

    1. Jean, that is astonishing. I have been working on parsing software which helps facilitate the kind of search you are asking for. I will get started tonight.

      1. Re: Jeff Condon (Mar 18 07:55),
        Thanks. I’ve gone through the CG1&2 emails, and I think it is not there. So that should further narrow the search. But it is quite possible that the revelation is not in the emails; I think Mann and Jones met at a conference a month or two before.

      2. From my work on CG3 this weekend, I suspect I won’t find anything. I have sorted by who sent which email and have chased through all Mann and Jones emails on this topic this weekend. Often a message only exists by receipt though and that will take a bit more searching. I have also been working on creating links between emails for high probability conversations but until that is finished, I will be using my initial tools.

    2. Jean S, thanks for pointing out his code to me. I’ve not had real doubt about his numerical skills nor about his level of objectivity for a while now.

      Looking forward to your posts on this. Unfortunately there does seem to be some linkage between Mann (as a potential reviewer) and Marcott’s paper coming off the rails, so it would be apropos.

      Take this quote from Marcott’s paper:

      Our global temperature reconstruction for the past 1500 years is indistinguishable within uncertainty from the Mann et al. (2) reconstruction.

      Now, Nick has argued on ClimateAudit that this doesn’t say what it seems to say, namely that Marcott’s reconstruction is statistically indistinguishable from Mann’s, but at least the Nature magazine version (hat tip to mt) suggests that Marcott at least agrees with my interpretation:

      The temperature trends that the team identified for the past 2,000 years are statistically indistinguishable from results obtained by other researchers in a previous study, says Marcott. “That gives us confidence that the rest of our record is right too,” he adds.

      I suspect that in fact Marcott is statistically distinguishable (though I admit I haven’t tested it), but the more interesting question is: Why separate out Mann’s now somewhat dated reconstruction instead of more recent ones like e.g. Ljungqvist? Why single it out at all, instead of comparing against all other reconstructions?

      And why is there a sudden interest on Marcott’s part of the “uptick” at the end, which I think virtually everybody agrees now is spurious?

    3. I have spent a couple of solid hours reading emails, first by “from” and then the whole set. So far I haven’t found much other than a lot of fretting starting with Mann telling the boys that the divergent series should be relegated to a different section.

  7. The interesting thing about this email is that it shows Mann’s innumeracy.
    He says that ‘minimum roughness’ imposes a point of inflection, which is true, but what he fails to notice is that it forces the smoothed curve to pass through the final point of the series, giving that final point a vastly inflated importance.
    It’s a trivial exercise, though probably beyond the wit of Mann, to see that if you use a flat filter (OK, I know they don’t), the smooth at the n-1th point gets zero weighting from the n-1th data point! (And a very large weighting from the nth point).
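The flat-filter exercise can be checked numerically. The sketch below feeds unit impulses through a flat filter with x-y reflection at the late end and reads off the weight each data point gets in the smooth at the n-1th point; the series length, filter half-width and function name are arbitrary choices of mine.

```python
def smoothing_weights(n, half_width, target):
    """Weight of each data point in the smooth at index `target`, for a flat
    filter of width 2*half_width+1 with x-y reflection padding at the late end."""
    weights = []
    for j in range(n):
        impulse = [0.0] * n
        impulse[j] = 1.0
        # x-y reflection about the last point of the series
        tail = [2 * impulse[-1] - impulse[-1 - k] for k in range(1, half_width + 1)]
        padded = impulse + tail
        window = padded[target - half_width: target + half_width + 1]
        weights.append(sum(window) / (2 * half_width + 1))
    return weights


n, m = 50, 5
w = smoothing_weights(n, m, target=n - 2)   # index n-2 is the (n-1)th point, 1-based
print(w[-7:])
```

With a half-width of 5 (an 11-point flat filter), the final data point carries 9/11 of the weight and the four points before it carry none, because each of them is cancelled by its own reflection.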

    There is a marvellous piece of Mannian BS in email 1370 about this:
    “This is the preferable constraint for non-stationary mean processes, and we are, I assert, on very solid ground (preferable ground in fact) in employing this boundary constraint for series with trends…”

    1. Re: Paul Matthews (Mar 18 09:24),
      yes, the “roughness BS” etc comes from his early work (~1993) with Park. I’ve now gone through all of his publications and published code at least twice…

      Promises are made to be broken ;), and I’m too tempted to give some funnies to you guys, so here is the best part of MBH9x smoothing (of course as always with Mann, there are dozens of other complications, it wouldn’t take months to resolve it otherwise): check carefully the filter part (written by Mann) in this code (it got the ball rolling for me):

      1. Yikes! Now that’s what any sane person would call unreadable, undebuggable code. The nested if statements with the GOTO statements are mind boggling all on their own. You’re a braver man than I am to touch this dreck.

        1. Re: Paul Linsay (Mar 18 12:10),
          Yes, in the past couple of months it would have been convenient if I could have compiled any of his code…

          I’ll save some trouble from y’all: determine what is the filter length in the “lowpass” code 😉

          1. Jean S,

            If I’m following the breadcrumbs correctly, he calls lowpass() which in turn calls spfird to perform the actual filter. Parts of this code (but it’s unclear which parts) are attributed to Stearns and David, which uses MATLAB, but aspects of the code look like it’s a transliterated version of C code (the use of zero-based indices), since FORTRAN and MATLAB use the same array conventions.

            It looks like he ends up passing the number of data points to spfird here (so that’s the short answer):

            do i2=0,nscan-1
            end do
            N = i2
            LL = N-1
            call spfird(N-1,iband,f1,f2,iwndo,b,ierror)

            “nscan” (I’m assuming) is the number of points read in. He hands off this value in a rather awkward fashion:

            i2 exits the do loop with the value “nscan”, which is then assigned to “N”, before being passed to spfird as N-1.

            spfird even though it’s using one-based arrays, appears to expect the largest addressable element minus 1.

            The loop for the low-pass looks like this
            do 6 i=0,lim
            6 continue
            if(mid.eq.1) b(l/2)=wc1/pi

            “l” should be “N-1”, “mid” is a flag that gets set if there is an odd number of points in the filter (the code for setting this up is a just a bit unwieldy), but other than that, this is a pretty standard tapered Fourier Series based filter design.

            This code as written is very “fragile”. The use of “ell” as a variable name is dangerous of course, but there are numerous places where he’s using an intermediate product in place of the original passed-in quantity, as well as a number of other poor software practices.

            Normally, I’d compile the code with the “-g” option and run it in gdb and step through it to see what actually happens.

  8. Warming: you needs some matlab toolboxes for this to run…

    Is there something Freudian in Mann’s spelling error in the first word?

  9. Re: Carrick (Mar 18 12:59),

    is the number of points read in. He hands off this value

    Yes, his FIR filter is of the same length as the signal to be smoothed!!! I still don’t know if it should be called a “filter” as filter is, by the definition, a time-invariant system. Of course, you can think of it as a standard filter with an enormous zero-padding in the signal part (which is essentially what Mann’s brute-force implementation of the convolution is doing). Anyhow, that’s the basic “algorithm” that was used in smoothing the infamous curves in MBH9x … of course, how it was actually implemented is a different story…

    Jeff, while you are reading the emails, notice that Mann never (at least in CG1&2) explained his MBH filter parameters to anyone, although that would have been very convenient on some occasions (instead he offers the already smoothed values for download from an ftp site). I think the IPCC TAR figure ended up being handled such that Mann sent his already smoothed values to a person (Ian) who then added the rest of the curves using Mann’s program/instructions (apart from the CG emails there is now a file (IPCC TAR figure) whose time stamp etc. indicate that).

    1. Jean, would the result of such a filter tend to compress variance to the mean? And if true, wouldn’t that make Steve’s paper about picking hockey sticks even more true?

      1. It is hard to characterize what such a filter is doing, as it cannot be described by standard digital filter theory (that’s why it is so ridiculous). I think it is best thought of as a standard FIR filter with zero padding. So it means that the filter is extremely good (as it is so long) in the middle part of the signal, but near the ends its behavior becomes more and more dominated by the zero padding.

        Anyhow, this filter is only used in the MBH9x graphics for smoothing the final end product, so it has nothing to do with how the actual reconstruction is done. In fact, it was hard to find any instances where Mann had used any type of smoothing prior to this in MBH98. I was interested in knowing what filtering/smoothing operation was used in those graphics, as that is the only way to be sure what padding was used in the MBH9x smoothed curves (Mike’s Nature Trick).
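The "dominated by the zero padding" point can be made concrete by counting, for each output position, how many filter taps land on real samples rather than on padding. This is a sketch under assumptions: the function name `taps_on_real_data` and the centring convention (output n sees samples n + half - j for tap j) are illustrative choices, not taken from Mann's code, and the count ignores the tap weights entirely.

```python
def taps_on_real_data(signal_len, filter_len):
    """Count the taps of a centred FIR filter that overlap real samples.

    The signal is assumed to be zero padded on both sides; taps that fall
    outside 0..signal_len-1 see only padding and are not counted.
    """
    half = filter_len // 2
    counts = []
    for n in range(signal_len):
        lo = max(0, n + half - (filter_len - 1))   # earliest real sample reached
        hi = min(signal_len - 1, n + half)         # latest real sample reached
        counts.append(hi - lo + 1)
    return counts

full = taps_on_real_data(998, 998)    # filter as long as the record
short = taps_on_real_data(998, 51)    # a conventional short filter, for contrast
print(full.count(998), short.count(51))
```

With the filter as long as the record, only one output position has every tap on real data, and the two ends have about half; with the 51-tap filter, all but the outermost 25 points at each end are fully supported.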

  10. Jean S, I imagine part of this is just differences in our respective lingos.

    My usage would follow pretty closely that of wikipedia.

    I do have thingies I’d call “filters” that aren’t time invariant for example (e.g., a filter that tracks the center frequency of an otherwise narrowband signal whose center frequency is time varying).

    The code looks like something Mann might have written while he was a grad student. Given the proximity of Stearns and David (1996) to when he started working on proxy research (1996) that’s probably not a coincidence.

    1. I don’t know what you mean by “part of this”, but as for lingos, I used the term “filter” above as a synonym for “digital filter” as also described in wikipedia. That’s simply because FIR filters (as done by Mann and Stearns&David) fall under that category. Of course I’m aware that there are also non-linear and time-varying filters.

      Yes, the code is from 1995, and the code snips are from Stearns and David (1993). They are also available here (Mann did not appear to have changed anything in their code).

      Anyhow, have you ever seen such a FIR filter design (filter length=signal length)? I had not, nor had any of the signal processing experts I’ve talked to about it.

      1. Jean S, by “part of this” I was referring to your language choice

        I still don’t know if it should be called a “filter” as filter is, by the definition, a time-invariant system.

        In my experience, it’s not unusual to implement a FIR in the frequency domain and to use the entire time wave form when doing so.

        Thanks for the link to Stearns and David (1993).

        This is poorly written code for something that other people are going to be using. Wow.

        (It’s also odd that Mann didn’t preserve the copyright statements… you’re supposed to.)

        1. To be clear (sorry time-pressured here), you seemed to suggest that in your lexicon, filters were always time-invariant.

          Reading back through your comments, I see now you probably meant FIR filters.

          1. Re: Carrick (Mar 18 15:57),
            No, I meant digital filters as in Oppenheim & Schafer (from where I first learnt them), or as in Mitra (2001) which happens to be on my table. There “filter” is always time-invariant. The same applies probably also to Stearns and David.

          2. Now I’m confused. You just agreed that filters need not be time invariant:

            Of course I’m aware that there are also non-linear and time-varying filters.

            I’m sure both can’t be right. 😉

            In any case, as the term “digital filter” gets used in the DSP community, there is no requirement that a “filter” be time invariant. Again I will refer to Wikipedia, not as a source of authority, but for a description that matches my own parlance.

            There are classes of algorithms for construction of digital filters that require it to be time invariant. That’s different of course than saying digital filters must be time invariant.

          3. Re: Carrick (Mar 18 16:54),
            no, you are not confused, you are just playing the silly language game Nick is the master of. No wonder you get along so well.

            I know there is not even the slightest reason I need to explain these to you, but I do it for the benefit of not so technical readers. The term “filter” originates from certain engineering fields (signal processing, systems and control theory, …), and originally (40+ years ago) it referred exclusively to linear, time-invariant systems. Since then some non-linear, time-varying “filters” have also been developed (although they do not have a similar, unifying theory as the LTI systems), but it is still the standard practice, as Carrick is well aware, to refer by “filter” (without any additional adjectives/knowledge) to an LTI system and, if it departs from that, to add a descriptive such as “non-linear filter”. In this case, there should not be even the slightest possibility of misunderstanding, as Mann’s design was in the context of FIR filters, which are a subclass of LTI systems.

          4. Jean S, I’m not playing any games here actually.

            What I was trying to tell you is your term, as you’ve learned it, is overly narrow as it relates to how this term is used in the field of digital signal processing. Which is a polite way of saying “you’re mistaken,” the way you’re using this phrase is too narrow.

            How a term might originate in science often has nothing to do with current usage, hence the comment on parlance. There is indeed a quite healthy field of “time varying digital filters”. If you google this phrase, you’ll find plenty of hits.

            You may choose (and those of dated references as well) to frame digital filters in other terms, but they have come to mean something more general than linear, time invariant ones.

            “Digital” communicates nothing about whether a filter has any of these properties. It merely implies the filter is designed to operate on a discrete rather than continuous data series. You can argue that it carries with it other properties, but the word itself does not have that connotation, nor do other people in general in the field of digital signal processing agree with you that it does. (I’m sure that you can find plenty of people who do, like those who only work with LTI digital filters).

            Were “digital” to automagically also imply these other properties, for example, it would not have been necessary for the author of the Wikipedia article to use the phrase “linear, time-invariant, digital filter”.

            While I can find many text books that give other explanations, I have no problem with them using the term “digital filter” that way, as long as they clearly define what they mean by it. But that doesn’t mean their use agrees with the use of other people in the field.

            Were I in Jean S’s position, and wanted to insist that “digital filter” only implied linear and time-invariant, even then, in a manuscript, I would still spell out that there are properties that I expected “digital filters” to have.

            In any case, even if the design for a filter is time-invariant, it can be the case that the implementation is not truly time invariant. MATLAB’s FFTFILT algorithm is an example of that. The fact that the implementation isn’t truly time invariant has no particular implications that I’m aware of, with respect to its application.

            If Jean S wants to relate this all to Mann, he should do so, but should be unsurprised that other people are less interested in dealing with the mistakes in a 1998 paper written by a student/young post-doc with a limited understanding of signal processing.

            I think this is enough of this for me. When people start getting surly, it’s time to move on.

        2. Re: Carrick (Mar 18 15:50),
          I pretty much know how to design a FIR filter both in the time and frequency domain (and due to these last two months I also know almost line-by-line the differences in various implementations in Matlab, Octave, and R), but could you clarify this:

          to use the entire time wave form when doing so

          Are you saying, that for a signal, let’s say, of length 1000, it is not unusual to design a FIR filter of length 1000?

          1. If you implement them in the frequency domain, there’s no particular advantage in not using the entire signal duration, other than the usual problems with end effects. For time domain filters, there is an issue with not using a fixed window size when comparing different data sets, but I’m sure you’re aware of why that is.

          2. Another reason, which I meant to include, is very large data sets. However, this is a practical issue, not a fundamental one. (And one that’s not usually present in paleoclimatology.)

          3. Re: Carrick (Mar 18 16:56),
            of course, you are not answering my direct question, as you are well aware that no one (except now Mann) is designing such FIR filters, but instead you give some “advanced sounding” chitchat. As you perfectly know, there is not the slightest (theoretical) difference whether the filters are designed in the time or frequency domain.

            I’ve never understood the motivation for this style of “discussion” from a highly intelligent person. You are not bluffing anyone knowledgeable but you may confuse those who are not. Anyhow, from past experience, I should have known better. From now on, you are on the same very short list in my book that Nick Stokes has been on for years.

            Again for the benefit of those not so technically advanced readers, filtering a signal of length x with a FIR filter of length x is the same thing as applying the x-point moving average to the signal of length x: you get a single value (the middle point) and if you want to extend beyond that you are dealing with the “usual problems with end effects” as Carrick puts it. In the case of Mann, he filtered a signal of length 998 (MBH99) with a FIR filter of length 998.
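The moving-average analogy is easy to try out. The sketch below is a brute-force, zero-padded convolution written for illustration only; the function name `filter_zero_padded` and the centring convention are assumptions, and it is not a reconstruction of the code under discussion. Smoothing a constant signal with an equal-weight filter of the same length recovers the constant at exactly one position, and everywhere else the result is pulled toward the zero padding.

```python
def filter_zero_padded(signal, taps):
    """Centred FIR filtering by direct convolution, treating samples
    outside the record as zeros (illustrative sketch only)."""
    n_sig = len(signal)
    half = len(taps) // 2
    out = []
    for n in range(n_sig):
        acc = 0.0
        for j, b in enumerate(taps):
            idx = n + half - j            # input sample seen by tap j
            if 0 <= idx < n_sig:          # anything outside the record is padding
                acc += b * signal[idx]
        out.append(acc)
    return out

signal = [1.0] * 8                        # constant record of length 8
taps = [1.0 / 8] * 8                      # 8-point moving average, same length
smoothed = filter_zero_padded(signal, taps)
print(smoothed)
```

Only one entry of `smoothed` equals 1.0; the end values are roughly half of that, which is the end effect described above.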

          4. Jean S, I actually find this coy method of yours of not asking what you’re trying to ask and expecting other people to guess correctly a bit obnoxious.

            I answered this question:

            Are you saying, that for a signal, let’s say, of length 1000, it is not unusual to design a FIR filter of length 1000?

            Had you wanted me to comment specifically on Mann’s case, perhaps you should have framed the question differently, by starting with the technical issues you were concerned about.

            Mann’s clearly misapplying an algorithm (something I find unsurprising at this point), and he’s applying it to data that presumably have a trend, possibly an offset and low SNR, so there are likely to be consequences. Hopefully you can show what those would be.

            Had you wanted to know whether I would use a time domain FIR with length N to fit a data set of length N, again, no. Generally I use a smaller window size, overlapping FFT filter design, then throw out the first and last window of data.

  11. Great discussion. Scientific details and the scientific record matter.

    Jeff, you may want to keep a lookout for any details of Mann’s calculations of confidence intervals in MBH 9x and PNAS 2008. Jean S – any progress from your investigations? (and thanks for all your work)

    I’ve asked the question again (see the end of the post at ) but despair of getting a direct response.

    1. Re: Geoff (Mar 18 21:26),
      Geoff, actually somewhat yes. I’m now pretty sure how Mann calculated his CIs in MBH99 (we have known for years how it was done in MBH98, that’s no mystery). Unfortunately, it appears the calculation involves one or two steps which were likely performed “manually” and are not documented anywhere (I seriously doubt that the information is given in any CG emails). So the only reasonable way to decode it would be to try to replicate it with the tools he used. Over the years I’ve already spent so much time on that issue that I think I’m not up to the task. If someone is willing to try, I think the feasible approach is to try to replicate “IGNORE THESE COLUMNS” with his multitaper program. I think the code is included in this program, and the additional data (residuals) needed are available in the CG1 files.

      1. Jean S, many thanks for the update. If skeptics were really well organized and funded someone would have done this already, but the volunteers have done yeoman work and cannot reasonably be asked to do more. I may see if I can interest someone in the task who could shed some light. By the way, I think the code for Mann’s PNAS 2008 paper does not cover the calculation of confidence intervals.

