# the Air Vent

## Because the world needs another opinion

• JeffId1 at gmail dot com



## Yamal- The Dirty Dozen

Posted by Jeff Id on October 5, 2009

We’ve all been looking at the Yamal treemometer (Steve Mosher’s name for it) ring width data. Yamal is a tree ring series with a huge hockey stick blade, used in, and likely highly influential in, a lot of serious studies which demonstrate the unprecedentedness of recent temperatures. When Steve McIntyre replaced 12 of the hockey-stick-creating proxies (the dirty dozen) with equally valid Schweingruber proxies, the blade of the hockey stick disappeared along with the unprecedentedness. Of course the boys at “Real” Climate made a big stink about it but notably missed any specific criticism of the methods or data chosen. In this post, however, I look at the RCS standardization method and the effect it has on the series.

RCS (regional curve standardization) is a method for correcting the ring widths of trees based on their age. For instance, we would expect a young tree to have thicker rings at its core. As it grew, thinner rings would form as the diameter increased, and eventually the diameter would become basically a non-factor. So dendro guys figure that they should fit an exponential decay function to tree rings, and that will give a basic correction factor for total width.

The function is of the form

correction = A + B * e ^ -(C * age)

Don’t worry too much if that’s not a familiar equation to you; you’ll figure it out either way. I’m sure it is familiar to many of my readers though. This equation is fit to 100% of the data simultaneously in Yamal. The assumptions are:

1 – All the trees in the same conditions grow at the same rates.

2 – Fitting the equation to all trees together will average out over all the variable climates and despite different conditions, we can achieve a similar result to #1.
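For readers who want to see the fit in action, here is a minimal sketch on made-up data. Everything in it is an assumption for illustration: the ring widths are synthetic, the "true" curve parameters (0.3, 1.2, 0.02) are invented, and only the functional form and the starting values are taken from the code posted in the comments.

```r
# Synthetic illustration only: these are made-up ring widths, not Yamal data.
set.seed(1)
age <- rep(1:200, times = 20)                 # 20 hypothetical trees, 200 rings each
rw  <- 0.3 + 1.2 * exp(-0.02 * age) +         # assumed "true" growth curve
       rnorm(length(age), sd = 0.1)           # year-to-year noise
pooled <- data.frame(age = age, rw = rw)

# Fit correction = A + B * exp(-C * age) to all rings at once
fit <- nls(rw ~ A + B * exp(-C * age), data = pooled,
           start = list(A = mean(pooled$rw) / 4, B = mean(pooled$rw), C = 0.01))
round(coef(fit), 3)                           # estimates of A, B and C
```

With every tree drawn from the same curve, the pooled fit recovers that curve. The interesting case is what happens when the trees do not all share one curve.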

There is a problem with these assumptions at the endpoints of dendroclimatology reconstructions: the most recent trees are still alive and will therefore, on average, have a skewed age. They are either younger, because they haven’t yet died, or older (as is the case here), because long-lived trees are easier to find than partially fossilized, mud-bound trees.

What it means is that the assumption fails for trees existing in vastly different conditions.

Now what I wanted to see was, first, what correction factor per year is used and, second, how different the correction factor is when fit to the dirty dozen. After all, tree age, location, group and other conditions affect trees dramatically. So my question was: if we had only the 12 Yamal trees, how would the correction factor look?

Figure 1 - Correction factors

The dozen trees fit to the same function have a very different result when run alone – these trees are different! Now climatology in general may be tempted to call this plot bunk because these trees are chosen in a timeframe where we should see hockey stick temperatures and we have too few samples. However, I would remind climatology that these were the same 12 trees used to create the MASSIVE blade on a 200 plus core study.
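The comparison in Figure 1 can be sketched on synthetic data. This is not the Yamal calculation: the two groups, their sizes and their curve parameters are all invented, and `growth`, `make_trees` and `fit_rcs` are hypothetical helper names. It only shows how a pooled fit and a subset fit come apart when a subgroup grows differently.

```r
# Synthetic illustration only: two invented groups of trees, not Yamal data.
set.seed(2)
growth <- function(age, A, B, C) A + B * exp(-C * age)

make_trees <- function(n, A, B, C, group) {
  age <- rep(1:150, times = n)
  data.frame(group = group, age = age,
             rw = growth(age, A, B, C) + rnorm(length(age), sd = 0.1))
}

trees <- rbind(make_trees(50, A = 0.3, B = 1.2, C = 0.03, group = "rest"),
               make_trees(12, A = 0.6, B = 0.8, C = 0.02, group = "dozen"))

# Same functional form and starting values as the posted code
fit_rcs <- function(d) {
  nls(rw ~ A + B * exp(-C * age), data = d,
      start = list(A = mean(d$rw) / 4, B = mean(d$rw), C = 0.01))
}

cor_all   <- coef(fit_rcs(trees))                            # every tree at once
cor_dozen <- coef(fit_rcs(trees[trees$group == "dozen", ]))  # the subgroup alone
round(rbind(all = cor_all, dozen = cor_dozen), 3)
```

The pooled coefficients are dominated by the larger group, so the curve used to "correct" the dozen is not the curve the dozen actually follows.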

Well the next question I had was what would these 12 series look like with RCS if the dirty dozen were used for the exponential fit alone.

Figure 2 - Blade Proxy RCS

The blade is dramatically reduced!! Just to show you how different Yamal is without the RCS correction factor calculated from the entire dataset, check out Figure 3. The black line is the same 12 trees with the original correction factor; the red line uses the new, also valid, corrections.

Figure 3 - Yamal variants

Note the substantial drop in the peak temperatures and the fact that pre-1900 is actually higher in temperature than 2000. Below is the zoomed in version.

Figure 4 - Yamal variants zoomed

Now don’t forget that despite the low count of trees for my RCS standardization, the “blade” years of the Yamal hockey stick only consist of those trees. These TWELVE trees were considered good enough to represent a huge portion of the planet in reconstructions. In the recent Arctic temperature reconstruction, Yamal was individually responsible for the temperature being unprecedented in the last 2000 years. Yamal represented 1/23 of the Arctic in that reconstruction yet its huge blade had an enormous influence on the outcome.

Now let’s talk about the green line. This line is RCS with NO Yad061 – well Yad is the leader of the dirty dozen. Meaner and taller than the rest. The green line in the above two plots represents what happens when Yad is removed – Ocean’s eleven?? A single tree with a huge influence on the hockey stick. Now again the green line is normalized to the original 12 series.
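To get a feel for how much one tree can move a 12-tree average, here is a small leave-one-out sketch. The numbers are invented (eleven ordinary index values and one large one) and are not Yad061's actual values. The point is only the arithmetic of a small sample.

```r
# Synthetic illustration only: invented index values for 12 trees.
set.seed(3)
idx <- c(rnorm(11, mean = 1.2, sd = 0.2), 4.5)   # 11 ordinary trees, one outlier
names(idx) <- c(paste0("tree", 1:11), "outlier")

full <- mean(idx)                                 # mean of all 12
loo  <- sapply(seq_along(idx), function(i) mean(idx[-i]))  # mean with tree i left out
names(loo) <- names(idx)

round(full - loo, 3)   # how far each single tree pulls the 12-tree mean
```

With only 12 samples, each tree carries 1/12 of the weight, so a single extreme core shifts the mean by roughly a twelfth of its distance from the others.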

Honestly, the RCS methodology makes some sense, however the blanket application of the same correction factor across different species and locations of trees is hard to swallow. I mean, the assumptions simply don’t allow for different species of trees to grow at different rates in different soil and weather conditions. That’s not a particularly comforting assumption and engineering wise, it requires validation.
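A rough sketch of why the blanket curve matters, again on invented numbers: take a group of trees with no climate signal at all, and standardize them (ring width divided by the curve) once with a curve that belongs to other trees and once with their own. The curve parameters and the `growth` helper are assumptions for illustration, and the direction and size of the effect depend entirely on them.

```r
# Synthetic illustration only: a subgroup with NO climate signal in it.
set.seed(4)
growth <- function(age, A, B, C) A + B * exp(-C * age)

age <- rep(1:150, times = 12)                        # 12 hypothetical trees
rw  <- growth(age, 0.6, 0.8, 0.02) + rnorm(length(age), sd = 0.05)

pooled_curve <- growth(age, 0.3, 1.2, 0.03)   # assumed curve fit to "the rest"
own_curve    <- growth(age, 0.6, 0.8, 0.02)   # curve these trees actually follow

old <- age > 100                              # the late, old-age rings
c(pooled = mean(rw[old] / pooled_curve[old]),
  own    = mean(rw[old] / own_curve[old]))
```

Standardized by their own curve the old rings average about 1, as they should. Standardized by the mismatched curve they come out far above 1, which would read as warmth even though none was put in.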

If you take anything away from this, think about this question: What do these plots mean?

Actually, it’s quite simple and is shown in Figure 1.

The change in growth rate of the dozen trees comprising the blade of the influential Yamal hockey stick is NOT the same as the mean of the trees in the rest of the series.

We don’t know if the difference is due to species, location or some other factor, however these trees have vastly different ratios in growth rate from young to old. Briffa attributed it to temperature and it was eagerly accepted by climatology, however the lack of tree count in the important portion is a huge embarrassment.

As in the Antarctic reconstruction, I like to refer to the simple methods as a sanity check. We know that the assumption that trees have wider rings and faster growth in higher temperatures is for the most part reasonable. The same is true for sunlight, moisture, nutrients and CO2. However, extraordinary differences in growth rate, such as the Briffa Yamal series claims, require extraordinary evidence.

We would expect in a large sampling of data that the older and younger rings would balance out. The following figure is a mean of tree ring widths for the series. It should be relatively difficult for climatology to explain away the average of the Yamal series.

Figure 5 - Yamal RCS and Mean

The mean just doesn’t have the same impact does it?

It is my contention that the Mean is equally as valid (and perhaps more so) a scribble as Briffa’s Yamal. All tree ring widths are accounted for equally, tree counts per year are taken into account and the result doesn’t match the RCS version – at all.
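For anyone wanting to try the plain-mean idea on their own data, a minimal sketch follows. It is not the `mean.chronology` utility used in the code in the comments (I am not reproducing that here); it assumes a long-format table with hypothetical columns `id`, `year` and `rw`, filled with random numbers.

```r
# Synthetic illustration only: three invented trees with overlapping lifespans.
set.seed(5)
tree <- data.frame(id   = rep(c("a", "b", "c"), times = c(50, 40, 30)),
                   year = c(1901:1950, 1911:1950, 1921:1950))
tree$rw <- runif(nrow(tree), 0.2, 1.5)

mean_rw <- tapply(tree$rw, tree$year, mean, na.rm = TRUE)  # plain mean per year
count   <- tapply(!is.na(tree$rw), tree$year, sum)         # cores per year

chron <- ts(cbind(mean = mean_rw, count = count), start = min(tree$year))
window(chron, end = 1903)                                  # first few years
```

Keeping the count next to the mean makes it obvious which years rest on a handful of cores.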

Trees make lousy thermometers.

1. ### Jeff Id said

The code for the post above. It’s a bit sloppy but it gets the job done.

########################
#############################
source("http://www.climateaudit.org/scripts/utilities.txt") #
source("http://www.climateaudit.org/scripts/tree/utilities.treering.txt")
#utility smooth function

#Hantemirov at NCDC
loc="ftp://ftp.ncdc.noaa.gov/pub/data/paleo/treering/reconstructions/asia/russia/yamal_2002.txt"
#NOTE: the line that reads loc into hant is missing from the code as posted
dim(hant) #4064 4
hant=window( ts(hant[,2:4],start=hant[1,1]),start= -202 ) #minimum in measurement data

#Briffa Chronology from CRU
loc="http://www.cru.uea.ac.uk/cru/people/melvin/PhilTrans2008/Column.prn"
name0=scan(loc,n=8,what="")
name0=outer(name0,c("","count"),function(x,y) paste(x,y,sep=".") )
n=nchar(name0[,1])
name0[,1]=substr(name0[,1],1,n-1)
#NOTE: the line that reads the data columns from loc into briffa is missing from the code as posted
names(briffa)=c("year", c(t(name0) ) )
briffa[briffa== -9999]=NA
briffa=briffa[,c("year","Yamal.RCS","Yamal.RCS.count")]
briffa=briffa[!is.na(briffa[,2]),]
briffa=window(ts(briffa[,2:3],start=briffa[1,1]),start=-202)
yamal.crn=briffa[,1]/1000

#Yamal measurement data
#NOTE: the step that saves the CRU measurement file as "temp.dat" is missing from the code as posted
tree=make.rwl_new("temp.dat")
tree$id=factor(tree$id) #252
tree=agef(tree)
#save(tree,file="d:/climate/data/yamal/yamal_cru.rwl.tab")

range(tree$year) # -202 1996
yamal=tree
dim(yamal) # [1] 40892 4
yamal$rw=yamal$rw/10 # Sep 28
mean(yamal$rw,na.rm=T) # 61.52668

#################
## INFO COLLATION
#########################

Info=data.frame(id=as.character(levels(tree$id)) )
Info$start= tapply(tree$year,tree$id,min)
Info$end= tapply(tree$year,tree$id,max)
count=tapply(!is.na(tree$rw),tree$year,sum)
Info$max= Info$end-Info$start+1
Info$id=as.character(Info$id)
n=nchar(Info$id);temp=n>3
Info$core=""; Info$core[temp]=substr(Info$id[temp],n[temp],n[temp]);
Info$test=Info$id;
Info$test[temp]=substr(Info$id[temp],1,n[temp]-1)
Info$id=gsub("_","L",Info$id) #guess
x=substr(Info$id,1,3)
Info$site=NA; Info$site[temp]=x[temp]
Info$site[!temp]=substr(Info$id[!temp],1,1);
Info$n=1;Info$n[temp]=3
Info$tree=as.numeric(substr(Info$test,Info$n+1,nchar(Info$test)))
#Info[order(Info$n,Info$tree),]
# Info[order(Info$n,Info$end),]
(index3= Info$id[Info$n==3]) #this identified the 12 trees from 1988 on

################
# ANALYSIS 1: COMPARE COUNTS: http://www.climateaudit.org/?p=7142
######################

#1. clip the count image from Hantemirov 2002 CA/pdf/tree/hantemirov.2002.holocene.pdf
#2. Compare three versions

#plot count in CRU archive
count.yamal=countf(yamal)
year=c(time(count.yamal));N=length(count.yamal)
par(mar=c(3,3,2,1))
plot(year,count.yamal,col="grey80",type="l",ylim=c(0,45),yaxs="i")
polygon(xy.coords(x=c( year,rev(year)),y=c(count.yamal,rep(0,N))),col="grey80",border=1)
title("Yamal Count: CRU Archive")

#plot count in H and S version
par(mar=c(3,3,2,1))
year=c(time(hant));N=nrow(hant)
plot(c(time(hant)),hant[,"Samples"],col="grey80",type="l",ylim=c(0,45),yaxs="i")
polygon(xy.coords(x=c( year,rev(year)),y=c(hant[,"Samples"],rep(0,N))),col="grey80",border=1)
title("Yamal Count: Hantemirov NCDC")

#clip other illustration from HAntemirov and Shiyatov 2002

################
# ANALYSIS 2: COMPARE CHRONOLOGIES
######################
chron=ts.union(hant=hant[,2],briffa=yamal.crn)
Yamal=data.frame(year=c(time(chron)),chron)
fm=lm(briffa~hant,data=Yamal[Yamal$year<1800,]);summary(fm)
# Multiple R-squared: 0.8164
#correlation 0.908

#################
## 3. COMPARE COUNT TO URALS
###################
dim(tree) #157
count.urals=countf(tree)

#plot comparison
plot(c(time(count.urals)),count.urals,type="l",xlab="",ylab="",xlim=c(600,2005))
lines(c(time(count.yamal)),count.yamal,col=2,lty=3)
legend("topleft",fill=2:1,legend=c("Yamal","Polar Urals"))
title("Core Counts")

###################
##4. SCHWEINGRUBER russ035 IMPACT
####################

#calculate base case RCS emulation using CRU archive of Yamal
chron.yamal=RCS.chronology(yamal,method="nls")
#takes a little time

#show that emulation of CRU archived chronology is accurate
par(mar=c(3,3,2,1))
delta=mean(yamal.crn)-mean(chron.yamal$series);delta # -0.04820995
ts.plot(f(yamal.crn))
lines(f(chron.yamal$series)+delta,col=2)
legend("topleft",fill=1:2,legend=c("Archived","Emulated") )
title("Yamal RCS Chronology (CRU)")
#close match

# collated from ftp://ftp.ncdc.noaa.gov/pub/data/paleo/treering/measurements/asia/russ035w.rwl
russ035=tree
info.russ035=data.frame(id=as.character(levels(russ035$id))) #34 cores
mean(russ035$rw,na.rm=T) # 70.1419

##############################################
##############################################
#RCS function from SteveM modified to return yamal correction
tree=yamal
method="nls"
dimnames(tree)[[2]][4]<-"x" #sometimes names are "rw", sometimes "mxd"
tree<-tree[!is.na(tree$x),]
tree$id<-factor(tree$id)

if(method=="nls") { #preferred method
fm <- nls(x ~ A+B*exp(-C*age),data = tree,
start = list( A=mean(tree$x,na.rm=T)/4,B = mean(tree$x,na.rm=T), C= .01 ),
alg = "default", trace = TRUE,control=nls.control(maxiter=200, tol=1e-05, minFactor=1e-10));
B<-coef(fm);
fitted<- function(x) B[1]+B[2]*exp(-B[3]*x )
}

corYamal=fitted(1:300)
#make dataset with picks only (moved up: the mask has to exist before it is used)
temp=!is.na(match(yamal$id,index3)) #the 12 picked series
tree=yamal[temp,]#rbind(yamal[!temp,],russ035)
length(unique(tree$id)) #should be 12
#calculate RCS chronology from 12 hs proxies (moved up: needed by the second plot below)
chron.var1=RCS.chronology(tree,method="nls")

#RCS function from SteveM modified to return dirty dozen correction
method="nls"
dimnames(tree)[[2]][4]<-"x" #sometimes names are "rw", sometimes "mxd"
tree<-tree[!is.na(tree$x),]
tree$id<-factor(tree$id)
if(method=="nls") { #preferred method
fm <- nls(x ~ A+B*exp(-C*age),data = tree,
start = list( A=mean(tree$x,na.rm=T)/4,B = mean(tree$x,na.rm=T), C= .01 ),
alg = "default", trace = TRUE,control=nls.control(maxiter=200, tol=1e-05, minFactor=1e-10));
B<-coef(fm);
fitted<- function(x) B[1]+B[2]*exp(-B[3]*x )
}

cordirdoz=fitted(1:300)

plot(corYamal,main="RCS Correction Factor for Yamal Set\n Entire Set vs the Dozen", xlab="Year",ylab="Correction Factor",type="l")
lines(cordirdoz,col=2)
#savePlot(paste("c:/agw/Yamal correction factors.jpg"),type="jpg")

plot(f(chron.var1$series),col=2,lwd=2,main="RCS on Blade Proxy's Only",ylab="Alleged Temperature – Unscaled",xlab="Year")
#savePlot(paste("c:/agw/RCS dozen.jpg"),type="jpg")

tree2=yamal
chron.yamal=RCS.chronology(tree2,method="nls")

#NOTE: the step that drops Yad061 and builds tree$smooth and series is missing from the code as posted
tree$delta<- tree$x/tree$smooth
s2<-ts(series,start=min(tree$year))
plot(f(chron.yamal$series),ylim=c(0,2.8),main="Yamal Source Data",ylab="Alleged Temperature – Unscaled",xlab="Year")
lines(f(chron.var1$series),col=2,lwd=2)
lines(f(s2),col=3,lwd=1)
legend("topleft",fill=1:3, legend=c("From Archive","Yamal 12 RCS Standardized Individually","No Yad061"))
#savePlot(paste("c:/agw/Yamal with dozen overlay.jpg"),type="jpg")

plot(f(chron.yamal$series),ylim=c(0,2.8),main="Yamal Source Data",ylab="Alleged Temperature – Unscaled",xlab="Year",xlim=c(1000,2000)) #axis labels were swapped
lines(f(chron.var1$series),col=2,lwd=2)
lines(f(s2),col=3,lwd=1)
legend("topleft",fill=1:3, legend=c("From Archive","Yamal 12 RCS Standardized Individually","No Yad061"))
#savePlot(paste("c:/agw/Yamal with dozen overlay zoom.jpg"),type="jpg")

tree=yamal
mn=mean.chronology(tree,method="nls")
plot(f(mn$series)/sd(mn$series),main="Yamal Mean VS RCS",xlab="Year", ylab="Whatever",ylim=c(0,6))
lines(f(chron.yamal$series/sd(chron.yamal$series)),col="red")
legend("bottom",fill=1:2, legend=c("Mean of Yamal","Yamal RCS Emulation"))
#savePlot(paste("c:/agw/Yamal with mean.jpg"),type="jpg")

2. ### kuhnkat said

Jeff, Jeff, Jeff,

you just don’t understand DendroClimatology. Some trees make good thermometers. Therefore the trick is to identify those special trees. Now, who is going to know better how to identify a thermometer tree than a Bona Fide Tree Ring DendroClimatologist??

This fact is proven by how Briffa can take other people’s data and other peoples tools and identify a small group of tree cores that match well to a Modern Instrumental Global Average Temperature Hockeystick that is unrepresentative of especially the region where these trees grew!!!

If only more of us were this capable in our own fields Obama COULD save the World!!

PS:

On a serious note, I wonder whether this:

http://www.co2science.org/articles/V12/N2/EDIT.php

will ever be factored into tree ring “science” and RCS.

3. ### LeeW said

Jeff,

A bit OT but a point that you have made at least a few times and I never have seen it given serious attention, but in my mind, seems to be an extremely important topic for further discussion.

Having read Dr. Bouldin’s posts in CA I noticed a certain hubris in his responses, most significantly to Roman, that they are absolutely sure of their results. Well, this got me thinking a bit. I believe it may be entirely possible to pull some sort of climatic information from tree rings, but I find it highly improbable that any scientist could find historical temperatures within a 3C range let alone tenths or hundredths of a degree. And if this holds true for trees, it only makes sense that it would hold true of almost all paleo reconstructions regardless of the media.

This isn’t to say that other valuable information cannot be obtained, but the specificity that temperatures are ‘reconstructed’ does not seem very plausible to me.

Just curious what you, or some of the other learned people here feel about this issue, and if there is any way to seriously explore it further?

Thanks,

Lee

4. ### Tony Hansen said

From figure 5 – Yamal Mean is circa 5 ‘whatever’ in 400 BC and Yamal RCS Emulation is 6 ‘whatever’ in 2000. I was sure it was significant – but whatever the reason was escapes me for the moment.
Wasn’t Jean S using median instead of mean for something?
Ahh, whatever….

5. ### Layman Lurker said

#4

I think that Jean S was discussing the possibility of identifying an outlier in the “dirty dozen” by comparing mean and median values. In a normal distribution with many samples the mean and median would be very close statistically. In a thinner sample population the relationship between the mean and median could get very wonky if there was a significant outlier. Tom P. tried to suggest that it would be “cherry picking” if a suspected outlier was removed. However, if an outlier is identified with appropriate statistical methods, it should be removed. After all, we want a sample population which is representative to draw inferences.

6. ### Tony Hansen said

Thanks Layman,
How would one go about determining if the difference between mean and median was significant?
For a while I thought the difference might tell us something about the shape of the data. But that does not need to be so, does it?
If it is far from normal distribution then could it be any shape at all? And how far does far have to be?
Sorry – far too many questions and far too little understanding on my part.

7. ### Kondealer said

The assumption is:

1 – All the trees in the same conditions grow at the same rates.

2 – Fitting the equation to all trees together will average out over all the variable climates and despite different conditions, we can achieve a similar result to #1.

And these assumptions are pretty big.

Leaving that aside though it would appear to me (as a plant physiologist rather than a statistician) that the general function;
correction = A + B * e ^ -(C * age)
could be applied in a highly specific manner- at the individual tree level, by using the well-established and entirely non controversial technique of dendrochronology. This method would allow trees to be cross-matched by age so fossil trees could be compared directly with living by means of their tree-ring pattern overlaps.
I believe it would then be possible to check whether or not the same RCS function values were equally valid for trees growing at the same location, but 100’s of years apart.
Any RCS curve using fixed variables (“one size fits all”) would need IMHO to show “robustness” of these function variables over the entire timescale of the reconstruction.

Speaking of reconstructions have you seen this little gem?

In “The Times” today and online at;
http://www.timesonline.co.uk/tol/news/science/article6862384.ece

“Explorer’s logbooks prove a welcome bounty for climate change doubters”

8. ### Layman Lurker said

#7

Adequate metadata of course would help to confirm or revise your assumptions. Briffa himself has noted the potential issues with RCS processing, along with the systematic issues relating to data endpoints.

#6

Not familiar with anything specific. However, the sample population statistics should be relatively insensitive to the removal of a single data point. In the end, the conclusion may just be that the sample size is too small to have any potential meaning.

9. ### Nic L said

Good work, Jeff
I don’t know much about dendroclimatology reconstructions, but the big difference in the RCS correction factor between the 12 hockey stick blade trees and the full sample that you have identified looks to me like a serious issue. The effect of adjusting for it is pretty large.

10. ### David Jay said

RE: Figure 5

I do note that publications on RCS do not recommend using the technique with small sample sizes.

The small sample size appears to correspond to the period where RCS deviates so dramatically from the median (note beginning of the series in addition to the “dirty dozen”).

11. ### Leonard Weinstein said

The fact that the CO2 was rising sharply in the last 150 years along with the temperature, and the fact that CO2 is known to boost tree growth, makes the cause of the recent growth rate less clear. Even if the LOCAL temperature history were known and correlated with the ring data, it would not show that CO2 was not actually the major cause of growth. This on top of the small sample size and mixing of tree ages makes the whole issue very suspect of being even useful. However the LOCAL temperature history is not known well enough to assure a correct calibration. The data, as it stands, is not very useful.

12. ### Kenneth Fritsch said

Jeff ID, I find your analysis informative, but overall I find the analysis done at CA rather incomplete and disjointed.

I admit I might be missing some detail, but if tree age is a factor, and the literature and sensitivity all point in that direction, why is not a more complete age sensitivity test in the vein that Tom P initiated being presented? Tom P’s test results indicate that Yamal is not robust, because of differences in tree age response to climate for which the RCS standardization methodology does not compensate. If the series shape, in the modern time and distant past, is sensitive to tree age at some tree age, I would think we could conclude that either tree age be better compensated for or only older trees, where climate response has been shown to be more uniform, be used.

I personally see too much bandwidth wasted on reporting what RC is doing and replying to the likes of Lorax and Tom P. We will learn nothing from those conversations.

I intend to look into a tree age sensitivity test when I get back to my computer and in the meantime ignore some of these “filler” posts.

13. ### Jeff Id said

#12 I think Steve hates the weak threads too. His rate of commenting is about zero. Blogland is an odd place though, I can spend 30 minutes writing up a complaint about RC and get two thousand reads or spend 8 hours on sea ice data and get half of that. Not that it matters because I tend to write whatever has my interest. It’s just interesting to see.

On the tree age problem, it is a well known issue in paleo so now I’m not surprised that Briffa was reluctant to release the data behind the study. It simply doesn’t support the result which after this post appears to be an artifact of the method.

This post has 800 views so far.

14. ### Jeff Id said

#11, You can just PCA the signal out. It sorts the CO2 from the Temp and Moisture – or some such nonsense.

15. ### Mark T said

It also puts a little post-it note on each component, clearly identifying each source.

Even basic discussions of PCA immediately give a thorough treatment of the problem of correlated sources. Yet, somehow, we are to believe these idiots that haven’t read any of the basic texts are the authorities and Steve McIntyre is naught but an amateur.

Mark

16. ### conard said

#13 Jeff,

If you have not taken the time to look at Dr. Melvin’s work it may be well worth some of your time; chapter 5 of his thesis if you are short on time.

17. ### hmmm said

If they want to claim that this correction is valid they should at least provide its statistical limits (how much error/uncertainty does this assumption introduce?). It looks like the correction is so significant compared to the signal itself! Could you do a plot showing how much of the result (by percentage) is from the correction?

18. ### MikeN said

Kenneth, if you’re going to work on the code, then try adjusting TomP’s test to taking the average at every point in time, for trees that are above a certain age.
Tom’s test just takes the average of trees that live to a certain age, so in current period, you have all old trees, but in old periods you still have a mix of old and young trees.

19. ### Kenneth Fritsch said

MikeN, the approach I have recommended at CA is exactly along the lines that you suggest. I think that Tom P’s approach is ok for tree age sensitivity, but that his detail and conclusions are all wrong. On the other hand, I feel the criticisms of Tom’s tests have been off point.

When I have access to my computer again I plan to do just what you suggest.

20. ### EJ said

Gotta love your engineering perspective on all these issues.

Keep up the great work.

Thanks

21. ### Layman Lurker said

delayed.oscillator has posted an article commenting on this post.

http://delayedoscillator.wordpress.com/2009/10/17/yamal-iv-growth-curves-and-sample-size/

22. ### Yamal IV: Growth Curves and Sample Size « delayed.oscillator said

[…] 2 comments Via Deep Climate, I found this post by Jeff Id at The Air Vent. Comments there and elsewhere lead me to believe there is some confusion about the […]