# the Air Vent

## Because the world needs another opinion



## Mann 07 Part 4 – Actual Proxy Autocorrelation

Posted by Jeff Id on August 14, 2010

I hope this is the last post for a while on Mann07 variance loss. The difference between this post and my originals is simply the use of pseudoproxies created from models plus noise rather than just a single straight line. Mann provided 100 pseudoproxy curves from his 07 work on his website. By adding noise, he demonstrated that there was no significant variance loss in the CPS method. Unfortunately, he couldn’t seem to justify an autocorrelation rho greater than 0.32, whereas I found an autocorrelation range which included values both lower and higher than his.

As we’ll see,  this does make a difference.

For this post, I created 10000 pseudoproxies with 75% noise and 25% signal.   The noise has the same autocorrelation histogram as actual proxies taken from Mann08 – shown above.  We’ve discussed CPS here so many times I’ll just present the result.
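A minimal sketch of how pseudoproxies of this kind could be built is below. It is not the code behind this post. The model temperature curve and the Mann08 autocorrelation histogram are replaced by hypothetical stand-ins (`signal` and a uniform draw for `rho`), the 25/75 split is assumed to be an amplitude weighting, the noise is assumed to be AR(1), and the proxy count is cut to 2000 to keep it quick. The helper names `ar1_noise` and `make_pseudoproxies` are made up for illustration.

```python
import numpy as np

def ar1_noise(n_series, n_years, rho, rng):
    """Unit-variance AR(1) noise, one row per series.

    rho may be a scalar or an array holding one lag-1 coefficient per series.
    """
    rho = np.broadcast_to(np.asarray(rho, dtype=float), (n_series,))
    innov_sd = np.sqrt(1.0 - rho**2)  # keeps the stationary variance at 1
    x = np.empty((n_series, n_years))
    x[:, 0] = rng.standard_normal(n_series)  # start in the stationary state
    for t in range(1, n_years):
        x[:, t] = rho * x[:, t - 1] + innov_sd * rng.standard_normal(n_series)
    return x

def make_pseudoproxies(signal, rho, signal_frac, rng):
    """Mix a standardized signal with AR(1) noise using amplitude weights."""
    sig = (signal - signal.mean()) / signal.std()
    noise = ar1_noise(len(rho), len(sig), rho, rng)
    return signal_frac * sig + (1.0 - signal_frac) * noise

rng = np.random.default_rng(0)
years = np.arange(1000)

# Hypothetical stand-in for a model temperature curve (not Mann's data):
# a slow oscillation plus a ramp over the last 100 years.
signal = 0.5 * np.sin(2 * np.pi * years / 500)
signal = signal + np.where(years >= 900, (years - 900) / 100, 0.0)

# Hypothetical stand-in for the Mann08 proxy autocorrelation histogram.
rho = rng.uniform(0.0, 0.9, size=2000)

proxies = make_pseudoproxies(signal, rho, signal_frac=0.25, rng=rng)
print(proxies.shape)
```

Drawing `rho` from the measured proxy values instead of a uniform range would be the natural substitution to get closer to what is described above.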

In this example with 25% signal, I used a correlation threshold of 0.4, which retained 51 percent of the noisy proxies and threw the rest away. If you’re new to CPS, that’s what it does: it throws away data which doesn’t match temperature, basically a sophisticated way to get rid of data which doesn’t do what you want. The result is variance loss in the historic portion of the record.
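The screening and scaling steps can be sketched roughly as follows. The assumptions are loud ones: a hypothetical temperature curve, AR(1) noise with lag-1 coefficients drawn uniformly from 0 to 0.9, a 100-year calibration window, one-sided screening at r > 0.4, a plain average of standardized proxies, and variance matching in the calibration window. Real CPS implementations differ in detail, so the printed numbers are illustrative only and are not the results of this post.

```python
import numpy as np

rng = np.random.default_rng(1)
n_proxies, n_years, n_cal = 2000, 1000, 100
years = np.arange(n_years)

# Hypothetical stand-in for a model temperature curve, standardized.
temp = 0.5 * np.sin(2 * np.pi * years / 500) + 0.3 * np.sin(2 * np.pi * years / 37)
temp = temp + np.where(years >= n_years - n_cal, (years - (n_years - n_cal)) / n_cal, 0.0)
temp = (temp - temp.mean()) / temp.std()

# Unit-variance AR(1) noise with an assumed spread of lag-1 coefficients.
rho = rng.uniform(0.0, 0.9, size=n_proxies)
noise = np.empty((n_proxies, n_years))
noise[:, 0] = rng.standard_normal(n_proxies)
for t in range(1, n_years):
    noise[:, t] = rho * noise[:, t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal(n_proxies)
proxies = 0.25 * temp + 0.75 * noise  # 25% signal, 75% noise by amplitude

cal = slice(n_years - n_cal, n_years)   # calibration window
hist = slice(0, n_years - n_cal)        # everything before it

# Screening: correlation of each proxy with temperature in the calibration window.
p_cal = proxies[:, cal] - proxies[:, cal].mean(axis=1, keepdims=True)
t_cal = temp[cal] - temp[cal].mean()
r = (p_cal @ t_cal) / (np.linalg.norm(p_cal, axis=1) * np.linalg.norm(t_cal))
keep = r > 0.4
kept_frac = keep.mean()

# Composite: standardize each kept proxy over the calibration window, then average.
kept = proxies[keep]
z = (kept - kept[:, cal].mean(axis=1, keepdims=True)) / kept[:, cal].std(axis=1, keepdims=True)
composite = z.mean(axis=0)

# Scale: match the composite's calibration mean and SD to temperature.
recon = (composite - composite[cal].mean()) / composite[cal].std()
recon = recon * temp[cal].std() + temp[cal].mean()

# Amplitude check: regression slope of the reconstruction on the true signal
# over the historic period. A slope below 1 means variance loss.
amplitude_ratio = np.polyfit(temp[hist], recon[hist], 1)[0]
print(f"kept {kept_frac:.0%} of proxies; historic amplitude ratio {amplitude_ratio:.2f}")
```

The selected proxies are the ones whose noise happened to line up with temperature in the calibration window, so the composite is inflated there and the rescaling shrinks everything before it.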

There is an interesting clue we can take from this result.  Because we are using temperature-like curves from models, we can get an idea of the true signal to noise ratio in actual proxy data used.   In Mann08, they retain only 40 percent of the proxies with a correlation threshold of 0.1.    I’m retaining 50% with a s/n ratio of 25% at a much higher correlation threshold of 0.4 .

From Mann07

We adopted as our ‘‘standard’’ case SNR = 0.4 (86% noise, r = 0.37) which represents a signal-to-noise ratio than is either roughly equal to or lower than that estimated for actual proxy networks (e.g., the MXD or MBH98 proxy networks; see Auxiliary Material, section 5), making it an appropriately conservative standard for evaluating real-world proxy reconstructions.

We can see from this little bit of information that an SNR of 0.4 is hardly conservative.
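The numbers in the quoted passage can be checked with a little arithmetic, assuming that SNR there means the ratio of signal to noise standard deviations and that the noise is uncorrelated with the signal (the quote itself does not spell this out). With signal variance $\sigma_s^2$ and noise variance $\sigma_n^2$, so that $\mathrm{SNR} = \sigma_s / \sigma_n = 0.4$:

$$
\frac{\sigma_n^2}{\sigma_s^2 + \sigma_n^2} = \frac{1}{1 + \mathrm{SNR}^2} = \frac{1}{1.16} \approx 0.86,
\qquad
r = \frac{\mathrm{SNR}}{\sqrt{1 + \mathrm{SNR}^2}} = \frac{0.4}{\sqrt{1.16}} \approx 0.37
$$

Both values match the quoted "86% noise, r = 0.37", which suggests that under this reading the noise percentage is a share of variance and r is the correlation a proxy would have with the true signal.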

If I set a threshold of 0.13 using this data with an SNR of 0.25, I retain 96% of proxies and produce a plot with almost zero variance loss, which you would expect because we are using basically all the data.

Now I may go back next week and adjust the S/N until we get to a 40 percent retention, but for now I will be away from blogland until Monday having fun. The code for this was modified from the other CPS posts and its comments need cleaning up before I present it here; unfortunately, that will also have to wait. You’ve seen it enough times anyway; click the hockey stick posts link above for an example.

The main point, though, is that if we have any higher-autocorrelation proxies in the mix, the CPS method scales them in a way that reduces the historic signal. This happens because proxies with higher autocorrelation have artificial noise trends that take over the correlation value and create the de-weighting effect we can see in the historic signal. The reason M07 had no trouble with variance loss, proving the ‘robustness of reconstructions’, was that he didn’t use a low enough S/N and his methods used proxies with too low an autocorrelation. Mann 07 is therefore incorrect.
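One way to look at the autocorrelation effect in isolation is to screen series that contain no signal at all. The sketch below assumes AR(1) noise, a 100-year calibration window and a simple linear trend as the calibration target; the function name `pass_rate` and all parameter values are hypothetical choices for illustration, not anything taken from Mann07 or from the code behind this post.

```python
import numpy as np

def pass_rate(rho, threshold=0.4, n_series=5000, n_cal=100, seed=0):
    """Fraction of pure AR(1) noise series whose correlation with a linear
    trend over the calibration window exceeds the screening threshold."""
    rng = np.random.default_rng(seed)
    x = np.empty((n_series, n_cal))
    x[:, 0] = rng.standard_normal(n_series)
    for t in range(1, n_cal):
        x[:, t] = rho * x[:, t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal(n_series)

    # Centered linear trend standing in for calibration-period temperature.
    trend = np.arange(n_cal, dtype=float)
    trend -= trend.mean()

    # Pearson correlation of every series with the trend.
    xc = x - x.mean(axis=1, keepdims=True)
    r = (xc @ trend) / (np.linalg.norm(xc, axis=1) * np.linalg.norm(trend))
    return float((r > threshold).mean())

rates = {rho: pass_rate(rho) for rho in (0.0, 0.3, 0.6, 0.9)}
for rho, rate in rates.items():
    print(f"rho={rho:.1f}: {rate:.1%} of signal-free series pass r > {0.4}")
```

Persistent noise wanders, and a wander over a short window looks like a trend, so signal-free series with high rho clear the threshold far more often than white noise does. Those are the series that carry weight in the calibration window and contribute nothing before it.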

Darn, I just realized there has to be at least one more post.


1. ### Jeff Id said

In the figure 2 example, we have a signal amplitude about 68% of actual. In M08, the loss is certainly more severe than this.

2. ### DeWitt Payne said

Jeff,

I’m still planning on doing something similar with an ARFIMA (1,x,0) model. A lot of the proxies have a small AR coefficient (less than 0.1) and a difference factor x greater than 0.05 (the histogram of the difference factor is fairly flat from 0.05 to 0.5). The difference factor is related to the Hurst coefficient by x = H – 0.5. About half of the retained proxies (1850-1995 test) are in this category. It will be interesting to see if a high Hurst coefficient causes more or less variance loss than a high AR coefficient. The R package fArma has the tools to do this.

3. ### Jeff Id said

#2, It should be pretty easy to work with an ARFIMA model in these calcs. I think that anything which supports persistence in the data will create the same effects.

4. ### tonyb said

Jeff

This detailed study of temp proxies, including that compiled by Mann, has been done by proper statisticians and is just out.

Amongst the many put-downs in this exhaustive report is:

“Climate scientists have greatly underestimated the uncertainty of proxy based reconstructions and hence have been overconfident in their models.”

One of the authors, McShane, is a newly qualified PhD, so presumably has been learning the very latest in statistical analysis theory.

Would be interested to hear how this fits in with your own analysis.

tonyb

5. ### Geoff Sherrington said

Stupid me, but I have never been able to comprehend the logic that says you can calibrate temperature proxies back 2,000 years based on a match over 100 years, but a match in which it is expressly stated that there is a new variable, namely, heating by man-made GHG. Surely this heating has to be removed from the recent math before you can reconstruct the last few millennia from proxies?

In the related world of analytical chemistry, the accuracy of analysis is affected by the goodness of fit between the composition of the calibrating standards and the unknown samples. You simply do not use standards that have known extra additives in them, do you DeWitt? Cross talk between impurities, suspected or not, is one of the largest and hardest problems in the calibration step on analysis – and the effects are seldom first order linear interferences.

6. ### Eric Anderson said

Good point, Geoff. There is an inherent inconsistency with the idea of calibrating to the “contaminated” [by man] time period. I think this is just glossed over in the simplistic “CO2 necessarily leads to warming” idea, in which man is seen as nothing more than a producer of CO2 without other influence.

7. ### Earle Williams said

#5 Geoff Sherrington,

The influence of anthropogenic GHG is removed already. We know that “… the data selection, collection, and processing performed by climate scientists meets the standards of their discipline.”

Trim off the last 45 inconvenient years of divergent data and BAM! you’ve got coherent signal baby. 🙂

Okay, putting the sarcasm aside. You highlight yet another inconsistency in the CAGW melange. If one were to follow your argument assiduously, it would be necessary to calibrate prior to the alleged period of GHG contamination.

8. ### Ryan O said

If trees are thermometers, they care not whether the temperature is due to man’s influence. They would simply respond to whatever temperature their environment produces. There would be no need to calibrate to any specific period. There is not something inherently different about temperature that is the result of increasing GHG concentrations and temperature that is not.

If trees are not thermometers, then what you need to do is remove Mann’s influence . . . which starts in 1998.

9. ### curious said

Ryan, Jeff – any news on your Antarctic paper and publication?

10. ### Ryan O said

It hasn’t been rejected yet. 😉

11. ### curious said

Ok – thanks. Good luck 🙂

12. ### TimG said

Jeff,

The warmers are constructing their attack on MW2010. One of their talking points is that MW did not ‘follow the widely accepted practice’ of testing their methods against pseudoproxies and a GCM (see deepclimate for more). Based on what you have done, the counterargument appears to be that warmist testing against pseudoproxies is a shell game, because they pick pseudoproxies that have no statistical resemblance to real proxies and “coincidentally” fail to expose the flaws in the methods that can be easily shown using more realistic proxies. Is my interpretation/summary reasonable?

13. ### SNR Estimates of M08 Temperature Proxy Data « the Air Vent said

[…] Mann 07 Part 4 – Actual Proxy Autocorrelation […]

14. ### Signal to Noise Ratio Estimates of Mann08 Temperature Proxy Data « Climate Audit said

[…] In demonstrating the effect in the recent Mann07 post at the Air Vent, we used model temperature data and added AR1 noise which matched the parameters from the 1209 proxies in Mann08. From that work we discovered that the percent proxies retained using CPS and a correlation threshold of r=0.1 results in a very high acceptance rate of the proxies 87% passed screening  – even though they had a signal to noise amplitude of 0.25.  This is significant because even 40% was considered a conservatively noisy proxy in M07 […]