the Air Vent

Because the world needs another opinion

Gridded GHCN Temps R2.5

Posted by Jeff Id on March 2, 2010

In recent months, several bloggers have worked out their own gridded temperature series. I’ve made several attempts, gradually improving each one. This particular version isn’t perfect yet, but the quality is getting close to what the pros produce, though there are probably still errors lurking inside. Again, this uses GHCN and Roman’s seasonal offsetting method. I’ve commented the code more thoroughly and checked it for hours. There were several small errors in the last version, including weighting and gridding problems, which likely produced the too-high trend and the atypical graph shape. It’s blogland though, and that’s what happens when you do your work online.

One important change in this version is the sorting of temperature series according to the inventory file. If the file says the curves came from the same station, the algorithm collects them and takes the mean. If the series are from different instruments, the data are combined using Roman’s seasonal offset method. This resulted in an improved match to the pro curves compared with the old versions, which attacked the problem of data quality control from a variety of directions.
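In rough code terms the rule amounts to something like the sketch below. This is a simplified illustration only, not the routine actually used; the real work is done by getallcurves() and monthly.offset() in the full script at the end of the post.

# sketch only: average duplicate records of one instrument, offset-match distinct instruments
mean_duplicates = function(dups)
{
	# dups: a list of monthly ts objects flagged as duplicates of the same instrument
	if (length(dups)==1) return(dups[[1]])
	m = do.call(ts.union, dups)
	ts(rowMeans(m, na.rm=TRUE), start=start(m), freq=12)	# plain average
}

combine_instruments = function(instruments)
{
	# instruments: a list of monthly ts objects from different instruments at one site
	# (monthly.offset, defined below, assumes the combined series starts in January)
	if (length(instruments)==1) return(instruments[[1]])
	monthly.offset(do.call(ts.union, instruments))$temps	# Roman's seasonal offsets
}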

I’ll show the curves and results and then paste the complete R code so that anyone who wishes can run it. After this version, it’s time to start messing around with different sorting methods, datasets, regions and those sorts of things.

Temperature since 1900 using seasonal offset matching

Temperature trend since 1978 using seasonal offset matching. Trend is 0.186 C/Decade

It’s great to see the trend drop into the expected range for ground station data. As usual, this incorporates all the problems in typical ground stations, including urban warming, time of observation, instrument change, poor siting and the rest.

Global temperature trends since 1900

Global temperature trends since 1978

Finally, a filtered plot since 1978 with UAH overlaid in red.

UAH in red overlaid on gridded GHCN

I’m pleased with this improved version of the gridded ground temperatures. The trend is in line with the expected values and the short term variance matches UAH. Of course the UAH satellite trend is substantially less than the ground trend, a difference that advocates blame on faults in instruments and science. IMO it’s more likely that the difference in trend is due to the well known yet poorly characterized urban warming effects. In particular, the ground data has a substantially higher trend since 2002 than UAH. It was during this time period that UAH started using a station-keeping satellite, which means that the same areas of the earth are measured at the same times every day and far fewer corrections are necessary to the UAH dataset.

Anyway, the code is in full below, plot routines and all. All you need to do to run it is download R and GHCN, copy this code to the clipboard, paste it into a script, and you can run it yourself.
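A few notes on what the script expects before it will run (my summary from reading the code, so treat it as a checklist rather than gospel):

# install.packages(c("maps","mapproj"))	# plotMap() calls map() and mapproject()
# the GHCN v2 files must sit under C:/agw/GHCN/ with the file names used in the data load section
# filter.combine.pad(), truncated.gauss.weights() and the UAH anomaly series are not defined
#   in this script; they must already be in the workspace and are only needed for the
#   filtered ff()/fs() plots and the final UAH overlay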

There is one more imperfection in the trend calculating code which I’m already aware of.  I believe it’s insignificant but let’s see if anyone finds it.

If you’re wondering what happens to my extra time, this should pretty well resolve it.

###############################################
## roman M functions
###############################################
##### Functions for calculating offsets
#  and combining station series

###function for calculating matrix pseudo inverse for any r x c size

psx.inv = function(mat,tol = NULL)
{
	#  if (is.null(dim(mat))) return( mat /sum(mat^2))
 	if (NCOL(mat)==1) return( mat /sum(mat^2))
	msvd = svd(mat)
  	dind = msvd$d
	if (is.null(tol)) {tol = max(NROW(mat),NCOL(mat))*max(dind)*.Machine$double.eps}
  	dind[dind<tol]=0
  	dind[dind>0] = 1/dind[dind>0]
 	inv = msvd$v %*% diag(dind, length(dind)) %*% t(msvd$u)
	inv
}
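
### optional sanity check (added for illustration; uncomment to run, needs the MASS package):
### the SVD pseudo-inverse above should agree with MASS::ginv() on a rank-deficient matrix
#library(MASS)
#A = matrix(c(1,2,3, 2,4,6, 1,0,1), nrow=3, byrow=TRUE)	# rank 2, so solve(A) would fail
#max(abs(psx.inv(A) - ginv(A)))				# ~0 at machine precision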

### single offset calculation routine
# also used by monthly routine

calc.offset = function(tdat)
{
	delt.mat = !is.na(tdat)
  	delt.vec = rowSums(delt.mat)
  	co.mat = diag(colSums(delt.vec*delt.mat)) - (t(delt.mat)%*%delt.mat)
  	co.vec = colSums(delt.vec*delt.mat*tdat,na.rm=T) - colSums(rowSums(delt.mat*tdat,na.rm=T)*delt.mat)
  	offs = psx.inv(co.mat)%*%co.vec
  	c(offs)-mean(offs)
}
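
### optional sanity check (added for illustration; uncomment to run): three copies of a
### common signal shifted by 0, +1 and -0.5 with partial overlap; calc.offset should
### return the shifts re-centered to mean zero, i.e. about -0.17, 0.83, -0.67
#set.seed(1)
#chk = rnorm(60)
#chk = cbind(chk, chk+1, chk-0.5)
#chk[1:20,2] = NA	# second series starts late
#chk[41:60,3] = NA	# third series ends early
#round(calc.offset(chk),3)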

### generally assumes that tsdat starts in month 1 and is a time series.

monthly.offset =function(tsdat)
{
	dims = dim(tsdat)
  	off.mat = matrix(NA,12,dims[2])
  	dat.tsp = tsp(tsdat)
 	for (i in 1:12)
	{
    		dats = window(tsdat,start=c(dat.tsp[1],i), deltat=1)
    		offsets = calc.offset(dats)
    		off.mat[i,] = offsets
	}
  	colnames(off.mat) = colnames(tsdat)
  	rownames(off.mat) = month.abb
   	nr = dims[1]
   	nc = dims[2]
  	matoff = matrix(NA,nr,nc)
 	for (i in 1:nc) matoff[,i] = rep(off.mat[,i],length=nr)
 	temp = rowMeans(tsdat-matoff,na.rm=T)
  	pred =  c(temp) + matoff
  	residual = tsdat-pred
	list(temps = ts(temp,start=c(dat.tsp[1],1),freq=12),pred =pred, residual = residual, offsets=off.mat)
}
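
### optional sanity check (added for illustration; uncomment to run): two overlapping
### monthly records of the same seasonal cycle, the second biased by +2 C; the fitted
### monthly offsets should differ by about 2 in every month
#set.seed(2)
#sig = ts(rep(10*sin(2*pi*(1:12)/12),50) + rnorm(600,sd=0.3), start=c(1950,1), freq=12)
#s1 = window(sig, end=c(1985,12))	# first instrument, ends in 1985
#s2 = window(sig, start=c(1975,1)) + 2	# second instrument, starts in 1975, +2 C bias
#chk = monthly.offset(ts.union(s1,s2))
#round(chk$offsets[,2] - chk$offsets[,1], 2)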

###
#Single offset
###

single.offset =function(tsdat)
{
  	offsets = calc.offset(tsdat)
  	names(offsets) = colnames(tsdat)
 	temp = colMeans(t(tsdat)-offsets,na.rm=T)
  	nr = length(temp)
  	pred =  temp + matrix(rep(offsets,each = nr),nrow=nr)
  	residual = tsdat-pred
	list(temps = ts(temp,start=c(tsp(tsdat)[1],1),freq=12),pred =pred, residual = residual, offsets=offsets)
}

#############################################
## Jeff Id stuff
#############################################

######## data load ########
loc="C:/agw/GHCN/v2.mean."
wd=c(3,5,3,1,4,5,5,5,5,5,5,5,5,5,5,5,5)
ghmean=read.fwf(loc,widths=wd,comment.char="")

loc="C:/agw/GHCN/v2.country.codes"
wd=c(3,38)
ccode=read.fwf(loc,widths=wd)

loc="C:/agw/GHCN/v2.temperature.inv"
wd=c(3,5,4,30,8,7,4,6,1,10,3,2,16,1)
#wd=100
tinv=read.fwf(loc,widths=wd,comment.char="")
stationnum=nrow(tinv) #7280

#save(ghmean, ccode, stationnum,tinv ,file="GHCN data.RData")
#load("ghcn data.RData")

###################################
############ Plotting functions #########
###################################
##
## very much of this section was written by RyanO, RomanM, NicL, SteveM and others.
## My part was to introduce the errors.
##

ff=function(x)
{
	filter.combine.pad(x,truncated.gauss.weights(31) )[,2]#31
}
fs=function(x)
{
	filter.combine.pad(x,truncated.gauss.weights(31) )[,2]#31
}

ssq=function(x) {sum(x^2)}

###  Set anchor colors (can use colors like "goldenrod", etc., as well, and
###  can add/subtract anchor points).  Must have at least 1 cold, 1 hot, and 1 mid,
###  and the # of cold & hot points must be the same (i.e., can't have 2 cold
###  and 5 hot anchor colors).

cold.end = rgb(red=50, green=0, blue=70, maxColorValue=255)
cold.2 = rgb(red=0, green=80, blue=220, maxColorValue=255)
cold.1 = rgb(red=0, green=210, blue=80, maxColorValue=255)
mid = rgb(red=248, green=248, blue=248, maxColorValue=255)
hot.1 = rgb(red=255, green=215, blue=0, maxColorValue=255)
hot.2 = rgb(red=255, green=80, blue=0, maxColorValue=255)
hot.end = rgb(red=60, green=0, blue=0, maxColorValue=255)
anchor.colors = c(cold.end, cold.2, cold.1, mid, hot.1, hot.2, hot.end)

###  Used internally by plotMap() to interpolate between the anchor colors

setColors=function (cols, divs) {

	###  Ensure divs within limits
	if (divs < 0) {divs = 0}
	if (divs > 255) {divs = 255}

	###  Get number of anchor points
	num.pts = length(cols)

	###  Start the color map
	len = length(cols) - 1
	color.map = cols[1]

	###  Interpolate between anchors
	for (i in 1:len) {

		###  Convert to RGB
		col.st = col2rgb(cols[i])
		col.en = col2rgb(cols[i + 1])

		###  Find steps
		r = (col.en[1, 1] - col.st[1, 1]) / (divs + 1)
		g = (col.en[2, 1] - col.st[2, 1]) / (divs + 1)
		b = (col.en[3, 1] - col.st[3, 1]) / (divs + 1)

		###  Define # of steps
		ramp = seq(1:(divs + 1))

		###  Interpolate
		r.ramp = col.st[1, 1] + ramp * r
		g.ramp = col.st[2, 1] + ramp * g
		b.ramp = col.st[3, 1] + ramp * b

		###  Concatenate
		color.map = c(color.map, rgb(red = r.ramp, green = g.ramp, blue = b.ramp, maxColorValue = 255))
	}

	###  Return map
	color.map
}

###  Use to assign parameters for the map projection.  For example, to obtain
###  Antarctic plots, do the following:
###
###  ant.params = mapParams(orientation = c(-90, 0, 0), ylim = c(-90, -60), lat.lines = FALSE, lon.lines = FALSE, type = "points")
###
###  Then, when running plotMap() for any Antarctic projection, simply add "map.params = ant.params" in the plotMap() command.

mapParams = function (database = "world", boundary = TRUE, parameters = NULL, orientation = c(90, 0, 0), fill = FALSE, col = 1, xlim = c(-180, 180), ylim = c(-90, 90), wrap = TRUE, bg = "white", border = c(0.2, 0.2), lat.lines = TRUE, lat.start = -90, lat.end = 90, lat.inc = 10, lat.col = "darkgray", lat.lty = 3, lat.res = 1, lon.lines = TRUE, lon.start = -180, lon.end = 180, lon.inc = 45, lon.col = "darkgray", lon.lty = 3, lon.res = 1, type = "rect", pch = 15, cex = 1) {

	###  Assemble into list for plotMap()
	map.params = list()
	map.params[[1]] = database
	map.params[[2]] = boundary
	map.params[[3]] = parameters
	map.params[[4]] = orientation
	map.params[[5]] = fill
	map.params[[6]] = col
	map.params[[7]] = xlim
	map.params[[8]] = ylim
	map.params[[9]] = wrap
	map.params[[10]] = bg
	map.params[[11]] = border
	map.params[[12]] = lat.lines
	map.params[[13]] = c(lat.start, lat.end, lat.inc, lat.col, lat.lty, lat.res)
	map.params[[14]] = lon.lines
	map.params[[15]] = c(lon.start, lon.end, lon.inc, lon.col, lon.lty, lon.res)
	map.params[[16]] = type
	map.params[[17]] = pch
	map.params[[18]] = cex
	map.params
}

###  Use to make overlays.  For this, if you have a secondary set of points that you want to plot (like CIs or something),
###  you would do:
###
###  overlay = list()
###  overlay[[1]] = makeOverlay(data, coords, ...)
###
###  You can add multiple overlays, too.  So after making that one, you could do:
###
###  overlay[[2]] = makeOverlay(newdata, newcoords, ...)

makeOverlay = function (X, coord = coords, norm = FALSE, convert = FALSE, pch = 1, cex = 1, type = "rect", density = NULL, angle = 45) {

	###  Format for read by plotMap()
	overlay = list()
	overlay$data = X
	overlay$char = pch
	overlay$size = cex
	overlay$lat = coord[, 1]
	overlay$lon = coord[, 2]
	overlay$lat.span = coord[, 4]
	overlay$lon.span = coord[, 5]
	overlay$norm = norm
	overlay$convert = convert
	overlay$type = type
	overlay$density = density
	overlay$angle = angle
	overlay
}

###  To add in your map parameters and overlays:
###
###  plotMap( ... map.params = ant.params, overlays = TRUE, Y = overlay ...)

plotMap=function (X, coord = coords, rng = 1, divs=255, cols = anchor.colors, na.col = "gray", m.title = NA, units = NA, font.size = 0.8, project = "mollweide", map.params = NA, norm=FALSE, overlays = FALSE, Y = NA) {

	library(mapproj)

	###  Read coordinates
	lat = coord[, 1]
	lon = coord[, 2]
	lat.span = coord[, 4]
	lon.span = coord[, 5]

	###  If no user defined projection parameters, load defaults
	if (is.na(map.params[[1]])) {map.params = mapParams()}

	###  Assign projection parameters
	dbase = map.params[[1]]
	bound = map.params[[2]]
	p.params = map.params[[3]]
	orientate = map.params[[4]]
	poly.fill = map.params[[5]]
	poly.cols = map.params[[6]]
	xlims = map.params[[7]]
	ylims = map.params[[8]]
	wrapper = map.params[[9]]
	backgnd = map.params[[10]]
	borders = map.params[[11]]
	lat.lines = map.params[[12]]
	lat.params = map.params[[13]]
	lon.lines = map.params[[14]]
	lon.params = map.params[[15]]
	type = map.params[[16]]
	char = map.params[[17]]
	char.size = map.params[[18]]

	###  Split screen
	if (screen() != FALSE) {close.screen(all.screens = TRUE)}
	fig = matrix(nrow = 3, ncol = 4)
	fig[1, ] = c(0, 1, 0.85, 1)
	fig[2, ] = c(0, .85, 0, 0.85)
	fig[3, ] = c(.85, 1, 0, 0.85)
	split.screen(figs = fig)
	screen(2)

	###  Plot projection
	par(mar = c(0, 0, 0, 0) + 0.1)
	map(database = dbase, boundary = bound, projection = project, parameters = p.params, orientation = orientate, fill = poly.fill, col = poly.cols, xlim = xlims, ylim = ylims, wrap = wrapper, bg = backgnd, border = borders)

	###  Generate color scale
	map.cols = setColors(cols, divs)

	###  Convert data to color scale
	lims = c(-rng, rng)
	temp = assignColors(X, lims, map.cols, norm)
	cols = temp[[1]]
	na.map = temp[[2]]

	###  Plot
	if (type == "rect") {
		temp = mapRectPlot(map.cols[cols], lat, lon, lat.span, lon.span, dens = NULL, ang = 45)

		###  Plot NAs
		if (sum(na.map) > 0) {
			temp = mapRectPlot(rep(na.col, length(lat))[na.map], lat[na.map], lon[na.map], lat.span[na.map], lon.span[na.map], dens = NULL, ang = 45)
		}
	}

	###  Plot
	if (type == "points") {
		temp = mapPointPlot(map.cols[cols], lat, lon, char, char.size)

		###  Plot NAs
		if (sum(na.map) > 0) {
			temp = mapPointPlot(rep(na.col, length(lat))[na.map], lat[na.map], lon[na.map], char, char.size)
		}
	}

	###  Overlays?
	if (overlays == TRUE) {

		###  Find number of overlays
		num.overlays = length(Y)

		###  Iterate through each overlay
		for (i in 1:num.overlays) {

			###  Extract data
			overlay = Y[[i]]
			dat = overlay$data
			char = overlay$char
			size = overlay$size
			lat = overlay$lat
			lon = overlay$lon
			lat.span = overlay$lat.span
			lon.span = overlay$lon.span
			ovr.norm = overlay$norm = norm
			convert = overlay$convert
			type = overlay$type
			dens = overlay$density
			ang = overlay$angle
			over.cols=dat

			###  Convert?
			if (convert == TRUE) {
				converted = assignColors(dat, lims, map.cols, norm)
				cols = converted[[1]]
				over.cols=map.cols[cols]
				na.map = converted[[2]]
			}

			###  Plot overlay
			if (type == "rect") {
				temp = mapRectPlot(over.cols, lat, lon, lat.span, lon.span, dens, ang)
			}
			if (type == "points" ) {
				temp = mapPointPlot(over.cols, lat, lon, char, size)
			}
		}
	}

	###  Overlay map
	map(database = dbase, boundary = bound, projection = project, parameters = p.params, orientation = orientate, fill = poly.fill, col = poly.cols, xlim = xlims, ylim = ylims, wrap = wrapper, bg = backgnd, border = borders, add=TRUE)

	###  Lat lines?
	if (lat.lines == TRUE) {

		###  Get parameters
		start.lat = as.numeric(lat.params[1])
		end.lat = as.numeric(lat.params[2])
		inc.lat = as.numeric(lat.params[3])
		col.lat = lat.params[4]
		lty.lat = as.numeric(lat.params[5])
		res.lat = as.numeric(lat.params[6])

		###  Generate line segments
		lats = seq(start.lat, end.lat, inc.lat)
		x.seq = seq(-180, 180, res.lat)
		x.len = length(x.seq)

		###  Plot
		for (i in 1:length(lats)) {
			lines(mapproject(list(x = x.seq, y = rep(lats[i], x.len))), col = col.lat, lty = lty.lat)
		}
	}

	###  Lon lines?
	if (lon.lines == TRUE) {

		###  Get parameters
		start.lon = as.numeric(lon.params[1])
		end.lon = as.numeric(lon.params[2])
		inc.lon = as.numeric(lon.params[3])
		col.lon = lon.params[4]
		lty.lon = as.numeric(lon.params[5])
		res.lon = as.numeric(lon.params[6])

		###  Generate line segments
		lons = seq(start.lon, end.lon, inc.lon)
		y.seq = seq(-90, 90, res.lon)
		y.len = length(y.seq)

		###  Plot
		for (i in 1:length(lons)) {
			lines(mapproject(list(x = rep(lons[i], y.len), y = y.seq)), col = col.lon, lty = lty.lon)
		}
	}

	###  Plot color scale
	screen(3)
	par(mar = c(0, 0, 0, 0) + 0.1)
	rect(xleft = 0, xright = 1, ybottom = 0, ytop = 1, col = "white", border = 0)
	lines(x = c(0.45, 0.65), y = c(0.9, 0.9), col = 1)
	lines(x = c(0.45, 0.65), y = c(0.1, 0.1), col = 1)
	lines(x = c(0.4, 0.65), y = c(0.5, 0.5), col = 1)
	yinc = 0.8 / length(map.cols)
	yvals = 0.1 + seq(1:length(map.cols)) * yinc
	for (i in 1:length(map.cols)) {
		rect(xleft = 0.45, xright = 0.6, ytop = yvals[i], ybottom = yvals[i] - yinc, col = map.cols[i], border = NA)
	}
	text(x = 0.66, y = 0.5, round(0.5 * (lims[2] + lims[1]), 1), pos = 4, cex = font.size)
	text(x = 0.66, y = 0.9, round(lims[2], 1), pos = 4, cex = font.size)
	text(x = 0.66, y = 0.1, round(lims[1], 1), pos = 4, cex = font.size)
	par(srt = 90)
	text(x = 0.2, y = 0.5, units, cex=font.size)

	###  Plot title
	if (!is.na(m.title)) {
		par(mar = c(0, 0, 0, 0) + 0.1, cex.main = font.size)
		screen(1)
		title(main = m.title)
	}

	###  Remove split screens for subsequent plotting
	close.screen(all = TRUE)
}

###  Plotting function called by plotMap()

mapRectPlot = function (dat, lat, lon, lat.span, lon.span, dens, ang) {

	idx = sign(lat * lon) == -1
	xy1 = mapproject(lon - (0.5 * lon.span), lat + (0.5 * lat.span))
	xy2 = mapproject(lon + (0.5 * lon.span), lat - (0.5 * lat.span))
	xy3 = mapproject(lon - (0.5 * lon.span), lat - (0.5 * lat.span))
	xy4 = mapproject(lon + (0.5 * lon.span), lat + (0.5 * lat.span))
	xy1$x[idx] = xy3$x[idx]
	xy1$y[idx] = xy3$y[idx]
	xy2$x[idx] = xy4$x[idx]
	xy2$y[idx] = xy4$y[idx]
	rect(xleft = xy1$x, xright = xy2$x, ybottom = xy1$y, ytop = xy2$y, col = dat, border = NA, density = dens, angle = ang)
}

###  Plotting function called by plotMap()

mapPointPlot = function (dat, lat, lon, char, size) {

	xy = mapproject(lon, lat)
	points(x = xy$x, y = xy$y, col = dat, cex = size, pch = char)
}

###  Function called by plotMap() to convert data to color scale

assignColors = function (X, lims, map.cols, norm) {

	###  Determine scale limits
	steps = (1 + length(map.cols)) / 2
	inc = lims[2] / steps

	###  Scale data by maximum absolute value?
	dat = as.vector(X)
	na.map = is.na(dat)
	if (norm == TRUE ) {dat = dat / max(abs(dat))}

	###  Assign colors to data points
	cols = trunc(dat / inc)
	cols = cols + steps
	hi = cols > length(map.cols)
	cols[hi] = length(map.cols)
	lo = cols < 1
	cols[lo] = 1

	###  Return
	temp = list(cols, na.map)
}

gridSphere = function (lat.size, ref = 0, eq.area = TRUE)
{

	###  Initialize variables
	lat.divs = 90 / lat.size
	size = lat.size * pi / 180
	areas = vector()

	###  Find exact area for each annular swath for latitudinal divisions
	for (i in 1:lat.divs) {
		areas[i] = (2 * pi) * abs(cos(pi/2 - size * (i - 1)) - cos(pi/2 - size * i))
	}

	###  Find longitudinal divisions and weighting factors (equal area cells)
	raw.divs = areas / areas[1] * lat.divs * 4
	lon.divs = round(raw.divs)
	weight.factors = sqrt(1 - (raw.divs - lon.divs) / lon.divs)

	###  Find weighting factors if cells use equal lat/lon divisions
	if (eq.area == FALSE) {
		lon.divs = rep(lat.divs * 4, lat.divs)
		weight.factors = sqrt(areas / areas[1])
	}

	###  Format for lat/lon pairs
	lon.divs = c(rev(lon.divs), lon.divs)
	weight.factors = c(rev(weight.factors), weight.factors)

	###  Set up variables for lat/lon pairs
	points = sum(lon.divs)
	coords = array(NA, c(points, 5))
	temp.lons = vector()
	temp.lats = vector()
	temp.weights = vector()
	temp.lat.span = vector()
	temp.lon.span = vector()

	###  Make lat/lon pairs
	for (i in 1:(lat.divs * 2)) {

		###  Format lats/lons and spans for storage
		lons = (360 / lon.divs[i] * (seq(1:lon.divs[i])-1)) - 180
		lon.span = rep(360 / lon.divs[i], length(lons))
		lats = rep(-90 + (i - 1) * (90 / lat.divs), length(lons))
		lat.span = rep(lat.size, length(lons))

		###  Find reference lat/lons for cells
		###  0 = center
		###  1 = top left
		###  2 = top right
		###  3 = bottom right
		###  4 = bottom left (requires no offset; script
		###      builds using bottom left reference)

		if (ref == 0) {
			lons = lons + (0.5 * (lons[2] - lons[1]))
			lats = lats + (45 / lat.divs)
		}

		if (ref == 1) {
			lats = lats + (90 / lat.divs)
		}

		if (ref == 2) {
			lons = lons + (lons[2] - lons[1])
			lats = lats + (90 / lat.divs)
		}

		if (ref == 3) {
			lons = lons + (lons[2] - lons[1])
		}

		###  Collate
		temp.lats = c(temp.lats, lats)
		temp.lons = c(temp.lons, lons)
		temp.weights = c(temp.weights, rep(weight.factors[i], length(lons)))
		temp.lat.span = c(temp.lat.span, lat.span)
		temp.lon.span = c(temp.lon.span, lon.span)
	}

	###  Store and return
	coords[, 1] = temp.lats
	coords[, 2] = temp.lons
	coords[, 3] = temp.weights
	coords[, 4] = temp.lat.span
	coords[, 5] = temp.lon.span
	colnames(coords) = c("Latitude", "Longitude", "Area Weight", "Cell Span (Latitude)", "Cell Span (Longitude)")
	attr(coords, "Info") = "Area weights are relative to equatorial cells."
	coords
}
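
### note: gridSphere() is not called by the gridding script below, which uses fixed
### 5 x 5 degree cells; for reference it builds an approximately equal-area grid
### (uncomment to try)
#eqgrid = gridSphere(5)
#nrow(eqgrid)		# number of cells
#range(eqgrid[,1])	# cell-centre latitudes, -87.5 to 87.5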

plt.avg=function(dat, st=1850, en=2011, y.pos=NA, x.pos=NA, main.t="Untitled") {

	###  Get trend
	fm=lm(window(dat, st, en)~I(time(window(dat, st, en))))

	###  Initialize variables
	N=length(window(dat, st, en))
	I=seq(1:N)/frequency(dat)
	rmI=ssq(I-mean(I))

	###  Calculate sum of squared errors
	SSE=ssq(fm[[2]])/(N-2)

	###  Calculate OLS standard errors
	seOLS=sqrt(SSE/rmI)*10

	###  Calculate two-tailed 95% CI
	ci95=seOLS*1.964

	###  Get lag-1 ACF
	resids=residuals(lm(window(dat, st, en)~seq(1:N)))
	r1=acf(resids, lag.max=1, plot=FALSE)$acf[[2]]

	###  Calculate CIs with Quenouille (Santer) adjustment for autocorrelation
	Q=(1-r1)/(1+r1)
	ciQ=ci95/sqrt(Q)

	###  Plot data
	plot(window(dat, st, en), main=main.t, ylab="Anomaly (Deg C)")

	###  Add trendline and x-axis
	abline(h=0, col=2); abline(fm, col=4, lwd=2)

	###  Annotate text
	text(x=ifelse(is.na(x.pos), st, x.pos), y=ifelse(is.na(y.pos), max(dat)-0.1*max(dat), y.pos), paste("Slope: ", round(fm$coef[2]*10, 4), "Deg C/Decade", "+/-", round(ciQ, 4), "Deg C/Decade\n(corrected for AR(1) serial correlation)\nLag-1 value:", round(r1, 3)), cex=.8, col=4, pos=4)
}
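
### optional sanity check (added for illustration; uncomment to run): an AR(1) series
### with a known trend of 0.2 C/decade; the plotted slope should come out near 0.2,
### with the CI widened by the Quenouille factor sqrt((1+r1)/(1-r1))
#set.seed(3)
#chk = ts(0.02*(1:360)/12 + as.numeric(arima.sim(list(ar=0.5),360,sd=0.1)), start=c(1980,1), freq=12)
#plt.avg(chk, st=1980, en=2009, main.t="Synthetic AR(1) check")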

# calculate anomalies
calc.anom=function(dat)
{
	vec=FALSE; ts.conv=FALSE

	###  If a time series, make sure to restore after conversion to matrix
	if(is.ts(dat)) {st=start(dat); en=end(dat); fr=frequency(dat); ts.conv=TRUE}

	###  Allows anomalies to be calculated for vectors
	if(is.null(dim(dat))) {dat=as.matrix(dat, ncol=1); vec=TRUE; mean.f=mean} else {mean.f=colMeans}

	###  Allows anomalies to be calculated for non-time series, column 1 matrices
	if(ncol(dat)==1) {mean.f=mean}

	for(i in 1:12) {
		sequ=seq(i, nrow(dat), 12)
		dat[sequ, ]=scale(dat[sequ, ], scale=FALSE)
	}

	###  Restore original characteristics of input data
	if(vec==TRUE) {dat=as.vector(dat)}
	if(ts.conv==TRUE) {dat=ts(dat, start=st, end=en, freq=fr)}
	dat
}
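
### optional sanity check (added for illustration; uncomment to run): after calc.anom
### every calendar month of the output should average to zero
#chk = ts(rnorm(240) + rep(1:12,20), start=c(1990,1), freq=12)
#round(tapply(calc.anom(chk), cycle(chk), mean), 10)	# twelve zeros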
############################################################
############### Jeff Id Stuff ##############################
############################################################
# combine station data by checking whether each series is in fact the same
getallcurves=function(staid=60360, type="A")
{
	allraw=NA
	#raw data
	mask= (ghmean[,2]==staid)
	data=ghmean[mask,]
	nostations=levels(factor((data[,3])))
	smask=TRUE
	if(type!="A")
	{
		smask= tinv[tinv[,2]==staid,9]==type
	}

	for (i in nostations[smask])
	{
		mask=data[,3]==i
		noser= levels(factor(data[mask,4]))
		stadata=data[mask,]
		staraw=NA##init to non time series

		for(j in noser)
		{
			mask2=stadata[,4]==j
			startyear=min(stadata[mask2,5])
			endyear=max(stadata[mask2,5])
			subd=stadata[mask2,6:17]
			index=(stadata[mask2,5]-startyear)+1
			dat=array(NA,dim=c(12,(endyear-startyear)+1))
			for(k in 1:length(index))
			{
				dat[,index[k]]=as.numeric(subd[k,])
			}
			dim(dat)=c(length(dat),1)

			dat[dat==-9999]=NA
			dat=dat/10
			rawd=ts(dat,start=startyear,deltat=1/12)
			if(max(time(rawd))>=2011)
			{
				print ("error series")
				rawd=NA
			}

			if(!is.ts(staraw))
			{
				staraw=rawd
			}
			else
			{
				staraw=ts.union(staraw,rawd)
			}
		}
		if(is.ts(staraw))	# keep the station only if at least one usable series was found
		{
			if(!is.ts(allraw))
			{
				allraw=ts(rowMeans(staraw,na.rm=TRUE),start=time(staraw)[1],freq=12)
			}
			else
			{
				allraw=ts.union(allraw,ts(rowMeans(staraw,na.rm=TRUE),start=time(staraw)[1],freq=12))
			}
		}
	}
	allraw
}
##############################################################
############ Assemble stations into grid Jeff Id #############
##############################################################
sid=levels(factor((ghmean[,2])))  #get station Id's

##assign stations to gridcells
rawcoord=array(0, dim=c(length(sid),2) )
j=1
for (i in sid)
{
	mask=tinv[,2]==i
	rawcoord[j,1]=tinv[mask,5][1]#lat
	rawcoord[j,2]=tinv[mask,6][1]#lon
	j=j+1
}

gridval=array(NA,dim=c(length(sid),2))

#place stations in closest grid
for(i in 1:length(sid))
{
	gridval[i,1]=as.integer((rawcoord[i,1]+90)/5)+1  #lat
	gridval[i,2]=as.integer((rawcoord[i,2]+180)/5)+1  #lon
}
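
### worked example of the 5 x 5 degree binning above (illustration only): a station at
### 42.4N, 87.8W lands in latitude row 27 and longitude column 19 of the 36 x 72 grid
#as.integer((42.4+90)/5)+1	# 27
#as.integer((-87.8+180)/5)+1	# 19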

#calculate gridcell trends
gridtem=array(NA,dim=c(36,72,211*12))
for(ii in 1:72)
{
	for(jj in 1:36)
	{
		maska=gridval[,2]==ii
		maskb=gridval[,1]==jj
		maskc= maska & maskb
		sta=NA ##initialize to non TS
		for(i in sid[maskc])
		{
			#get all station trends for station ID
			rawsta = getallcurves(as.numeric(i),type="A")

			#place all curves side by side in a ts structure
			if(!is.ts(sta))
			{
				sta=rawsta
			}else{
				sta=ts.union(sta,rawsta)
			}

		}

		if(is.ts(sta))#if at least one time series exists
		{
			sta=window(sta,start=1800)	#ignore pre-1800
			index=as.integer(((time(sta)-1800)*12+.002)+1)  #calculate time index for insertion into array
			if(!is.null(ncol(sta)))  #is station more than one column
			{
				gridtem[jj,ii,index]=calc.anom(monthly.offset(sta)$temps)
				#plot(sta[,1:10])

			}else{
				gridtem[jj,ii,index]=calc.anom(sta)
				#plot(sta)

			}
			print(jj)  #progress debug
			#plot(ts(gridtem[jj,ii,],start=1800,freq=12))  #debug
		}
	}
	print ("COLUMN")  #progress debug
	print(ii) 		#progress debug
}

glav=rep(NA,211*12)				#global ave
weight=sin(seq(2.5,180, 5)*(pi)/180)	#cell weights
meanweight=mean(weight)
weight=weight/meanweight			#normalize cell weights
#weight=rep(1,36) # calculate unweighted temp  #uncomment for unweighted version
mask=rep(FALSE,36)				#lat mask (hemisphere selector)
mask[1:18]=TRUE					#select southern-hemisphere latitude rows
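
### note: the weights above are cos(latitude) at the 5-degree band midpoints (sin of the
### colatitude), so for example a band centred at 62.5 degrees gets about 46% of the
### weight of an equatorial band
#round(cos(62.5*pi/180)/cos(2.5*pi/180), 3)	# ~0.462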

for(kk in 1: (211*12))				#hemisphere 1
{
	mm=rep(NA,72)
	for(ii in 1:72)
	{
		mm[ii]=mean(gridtem[mask,ii,kk]*weight[1:18],na.rm=TRUE)
	}
	glav[kk]=mean(mm,na.rm=TRUE)
}
glava=glav						#southern hemisphere average

mask=!mask						#select other hemisphere
for(kk in 1: (211*12))
{
	mm=rep(NA,72)
	for(ii in 1:72)
	{
		mm[ii]=mean(gridtem[mask,ii,kk]*weight[19:36],na.rm=TRUE)
	}
	glav[kk]=mean(mm,na.rm=TRUE)
}

glavb=glav						#northern hemisphere average

glav=(glava+glavb)/2				#global average

plot((ts(glav,start=1800,deltat=1/12)),main="Gridded Global Temperature GHCN",xlab="Year",ylab="Anomaly Deg C",xlim=c(1900,2011),ylim=c(-2,2))
#savePlot("c:/agw/plots/global GHCN station gridded temperature roman 1900.jpg",type="jpg")
gv=ts(glav,start=1800,deltat=1/12)
plt.avg((window(gv,start=1978)),y.pos=.8,x.pos=1980,main.t="Gridded Global Temperature GHCN")
#savePlot("c:/agw/global GHCN station gridded temperature since 1978.jpg",type="jpg")
plot(ff(window(gv,start=1978)),main="Gridded Global Temperature GHCN",xlab="Year",ylab="Anomaly Deg C")
lines(ff(UAH)+mean((window(gv,start=1978)),na.rm=TRUE)-mean(UAH,na.rm=TRUE),col=2)
#savePlot("c:/agw/global GHCN station gridded temperature since 1978 UAH.jpg",type="jpg")

###########################################
######## plot global trends ###############
###########################################
gtrend=array(NA, dim=(36*72))
for(jj in 1:36)
{
	for(ii in 1:72)
	{
		dat=window(ts(gridtem[jj,ii,],start=1800,deltat=1/12),start=1900)
		if(sum(!is.na(dat))>5)
		{

			gtrend[(jj-1)*72+ii]=coef(lsfit(time(dat),dat))[2]*10
		}
	}
}
## set up coordinates for plotting global trends
coorda=array(NA, dim=(36*72))
coordb=array(NA, dim=(36*72))

for(jj in 1:36)
{
	for(ii in 1:72)
	{
		coorda[(jj-1)*72+ii]=(ii-1)*5-180+2.5
		coordb[(jj-1)*72+ii]=(jj-1)*5-90+2.5
	}
}
coord=cbind(coordb,coorda,1,5,5)

plotMap(gtrend,coord,na.col=NULL,project="mollweide",m.title="GHCN Gridded Temp Trend Since 1900\nSeasonal Offset Matching",rng=c(1,-1))
#savePlot("c:/agw/plots/globe roman 1900.jpg",type="jpg")

54 Responses to “Gridded GHCN Temps R2.5”

  1. timetochooseagain said

    Did you use UAH over land or land and ocean?

  2. Jeff Id said

    #1 land and ocean.

  3. Tim said

    Can you clarify whether UAH plot includes oceans or is it just land?

  4. Tim said

    #2 so the two plots are not directly comparable?

  5. Andy said

    Funny how no one has made comment on Al Gore’s Op-ed in the NYTimes yesterday, not even to laugh at him.

  6. Jeff Id said

    #4 The plots are not of the same thing. UAH wouldn’t be comparable in the strictest sense even if it were masked for land only. The differences are substantial. However, the two data sets are measures of air temp so the short term variance as well as the trend are expected to be fairly close.

  7. Carrick said

    Jeff ID:

    The trend is in line with the expected values and the short term variance is matching UAH. Of course the UAH satellite trend is substantially less than ground, a difference that advocates blame on faults in instruments and science. IMO it’s more likely that the difference in trend is due to the well known yet poorly characterized urban warming effects

    As Tim points out, you need to use land-only UAH values in order to be able to compare them…the marine atmospheric boundary layer (ABL) is much more stable than the land ABL.

  8. Jeff Id said

    #7 I think you are too quick to assume that the difference is boundary layer induced.

    http://docs.google.com/File?id=dcjxztvr_164cc2kn7kg_b

    The layer thickness of UAH resists some of the sea boundary layer effects. If you look at the image, trends are reasonably independent of the surface.

    Of course a land only version would be better.

  9. Ruhroh said

    I think I know what’s wrong;

    You haven’t persuaded the weather bureau’s to do twice daily forecasts with your code.

    That’s how Slingo ensures that her gridded GHCN temp trend code is,

    dare I say it,,,

    roBUST!

    And those poor clueless MP’s were lapping it up…
    Yikes! Shameful misleading of the 3rd kind that Feynman warned against in the Cargo Cult Lecture.
    RR

  10. Ruhroh said

    BTW, when did software go technicolor?

    That really brings a new dementia to coding!

    Your code looks so attractive, I might actually be inspired to learn a new language.

    I was down in the basement of the Computer Center one night,
    35 years ago,
    counting the parentheses on my stinkin’ LISP “Program” when I had the epiphany,
    that I HATED software.

    But you make it look so nice, I’m tempted to give it a go once again…
    Thanks for the free tutorial.
    RR

  11. #10.

    HAHA counting parens in lisp. I hated Lisp. I could never finish a program in it to save my frickin life

  12. error due to spatial coverage? I wanna see romans take on Brohan 06

  13. Ausie Dan said

    Jeff – that’s very good work.
    but I have two comments.

    1. I have worked with the NCDC data set and have demonstrated that the global temperature (at least as shown by this valued added data set) consists of three components:
    (i) a long term linear trend.
    (ii) a cyclic term repeating every 65 years, more a series of linear zig-zags than a sine wave pattern.
    (iii) short term random, directionless, residual fluctuations.
    Since 1880, there have been two complete cycles, the last half cycle commencing in 1976. There is therefore little point in graphing 1980 onwards and calculating a rate of increase. You are merely capturing the latest (just finished) zig. The slope of the longer term linear trend is far less steep.

    2. To make any sense of trends in temperature, you should be working on data from true rural sites only. I don’t have that data, but from what I have been reading, the slope of these is close to zero over the last 100 years.

    I suggest that it is pointless replicating the published work of CRU and others.
    What we need is a real surface data set with no UHI (not adjusted data to combine urban and rural). Just pure, unadjusted, rural temperatures.

    When that is done, the next step is to re-examine the satellite data. I realise it is very accurate. I am beginning to ask just what is it measuring so accurately? It does not seem to be surface temperature.

  14. steveta_uk said

    “BTW, when did software go technicolor?”

    About 15 years ago – around the same time terminals did.

  15. Chuckles said

    #10,

    Satellite temps are definitely precise, and we hope they’re accurate🙂

    The satellite data is used to produce a lower troposphere temperature. It is most definitely not a ground surface temp.

    http://www.drroyspencer.com/2010/01/how-the-uah-global-temperatures-are-produced/

  16. KevinUK said

    Ausie Dan,

    I’ve done a very similar analysis of both the NOAA GHCN and NASA GISS raw datasets (without having to resort to anomalising and gridding of the data) and have produced maps showing the warming and cooling trends for four different time periods (1880 to 1910, 1910 to 1940, 1940 to 1970 and 1970 to 2010) as well as for the full 1880 to 2010 period.

    As you have also found, there is a clear linear trend (due to recovery from the LIA) with a multi-decadal cycle superimposed upon this linear trend. IMO the multidecadal cycles correlate very well with the oscillatory positive and negative phases of the PDO, ENSO, NAO, AMO. The late 20th century trend, i.e. the 1970 to 2010 trend, is not statistically different from the previous 1940 to 1970 warming period trend.

    The 1970 to 2010 trend is not ‘unprecedented’ even within the last 100 years, let alone the last 1000 years, as the Climategate paleoclimatologists (Mann, Briffa et al) would have us all believe. In fact the 1970 to 2010 trend differs somewhat from the 1910 to 1940 trend in that the latter shows continuous warming over the 30 year period. In contrast the 1970 to 2010 period trend consists of two distinct periods (1970 to 1990 and 1990 to 2010) in which most of the warming trend occurs during the 1970 to 1990 period. There is therefore no reason why anyone (including so-called climate scientists) should have to invoke man-caused CO2 emissions as the primary cause for the late 20th century warming trend. At most CO2 may be causing a slight additional warming trend over and above that due to natural climatic cyclic variability.

    To see the ‘interactive maps’ for the NASA GISS raw dataset click on the following link

    GISS raw dataset interactive maps

    and click on the ‘green’ links to see maps for each of the time periods. Note it can take a while for all the data to be loaded into the maps and it’s likely that you may receive a number of prompts. If so then click on ‘No’ until all the data has been loaded into the interactive map. Note that the ‘dark red’ dots are for stations that show a warming trend of > +5 deg.C/century and the ‘dark blue’ dots are for stations that show a cooling trend of < -5 deg.C/century. To see charts of raw/adjusted data for an individual station, just click on the coloured dot. A separate browser window will then open up showing charts with trends for the four separate time periods and for the overall 1880 to 2010 period.

    In particular pull up and contrast the 1910 to 1940 warming period with the 1970 to 1990 and 1990 to 2010 periods. After 1990 the number of reporting stations in the NASA GISS raw dataset reduces from over 5500 (pre-1990) to less than 1500 (post 1990). In particular notice how almost all the ‘dark red’ Canadian stations on the 1970 to 1990 trend map as well as most of the Russian (Siberian) and Chinese stations have ‘dropped out’ from the 1990 to 2010 trend map. So in fact most of those ‘dark red’ dots on the 1970 to 2010 trend map are in fact stations for which there is no data in the NASA GISS raw dataset post 1990, yet according to GISS we are supposed to believe that the planet has continued to still warm over the last 15 years. Well clearly it hasn’t in the western US, Spain, France and most of Japan and Australia.

  17. Basil said

    This is just impressive work. Will there be a time when one could access the end result, i.e. the monthly data as plotted in the first graph?

  18. BarryW said

    KevinUK

    What was the underlying trend that you calculated? I’m guessing at about half of what has been touted or maybe less.

  19. Basil said

    #13 Ausie Dan
    #16 KevinUK

    Besides a long term trend, and any multidecadal cycle you may see in the long term instrumental record, there are two well defined shorter cycles with periods of approximately 110 months and 248 months. Have you picked up on that in any of your analysis?

    This is why I asked in #17 if the monthly data of Jeff’s version will be available: I’d like to see if it evinces the same underlying periodicity.

    To put things in a visual context, here are two graphs, of the rate of change in the HadCRUT3 series:

    The left pane, with more smoothing, shows what may be the same multidecadal periodicity the two of you mention. The right side, with less smoothing, shows more clearly the shorter periodicity.

    What I would like to do is subject Jeff’s results to the same method of analysis, and see if it produces the same general picture(s).

  20. KevinUK said

    #19 basil,

    What monthly data are you after?

    You appear to have the HadCRUT3 data; do you want the NASA GISS (or NOAA GHCN) raw monthly data? If so then I can supply it. If you are familiar with the NOAA GHCN v2.mean file then I can supply you with a zipped version of the v2.mean_comp file that contains the monthly data for the NASA GISS raw dataset. This dataset is very similar to the NOAA GHCN raw dataset, but all the US station data have been replaced by the ‘pseudo-raw’ data from the USHCNV2 dataset. Data for the Antarctic stations in the SCAR dataset has also been added.

    If you are interested in comparing the raw GISS data with what GISS call their ‘homogenised’ data (i.e. after they have applied their GIStemp ‘step 1 and step 2’ adjustments), I can also provide that in the same format as the NOAA GHCN v2.mean_adj file.

    Let me know if you (or anyone else) want this data (raw, adjusted or both) and I will zip it up and add some links to it on this thread.

    Until you posted here I wasn’t aware of the shorter period cycles in the data, so thank you for that. Would it be possible to do your analysis for different subsets of the HadCRUT3 dataset? For example separately for the Northern and Southern Hemispheres, for the central US and the western and eastern US stations separately, etc. There is clear evidence in the raw data that the warming/cooling trends (when they start/when they end etc) are different depending on which part of the world/continent you look at. For example there is evidence that the climate in and around the countries bordering on the Caspian and Black Seas is almost in ‘anti-phase’ with the countries surrounding them, i.e. they warm when other countries cool and cool when the other countries warm.

    There is clear evidence in the raw data of wholly natural climatic variability, yet climate scientists refer to this as ‘noise’. It’s not noise, it’s natural cyclic climatic variability, and the rates of warming/cooling due to these natural climatic cycles are an order of magnitude greater than the underlying (nothing to get too excited about) warming trend due to our planet’s recovery from the LIA. As we also know, the LIA was preceded by a number of multi-centennial periods of general warming followed by cooling followed by warming, which have included (during the Holocene period) the Holocene Climatic Optimum, the Roman Warm Period and the Medieval Warm Period.

    There is nothing unusual whatsoever about our recent late 20th century warming period, yet climate scientists would have us all believe that our planet is in danger and that we must ‘act now’ in order to avoid future catastrophic climate change due to our continued use of fossil fuels.

  21. Layman Lurker said

    After this version, it’s time to start messing around with different sorting methods, datasets, regions and those sorts of things.

    Suggestions🙂 1) difference plot of land only GISS and/or HadCrut – offset GCHN… and 2) spatial gridded map of differences.

  22. Layman Lurker said

    typo – should be “GHCN”

  23. Basil said

    #20 KevinUK

    “What monthly data are you after?”

    In this case, Jeff’s. But I’ll gladly look at any monthly series, to see if the same patterns emerge. I’ve already looked at GISS, and they do, but it was not the raw set you refer to (v2.mean), so it would be interesting to look at that. If the zip files you refer (for the monthly data from v2.mean, and v2.mean.adj) are not too large for email, my email is blcjr2 at gmail dot com.

    “Would it be possible to do your analysis for different subsets of the HadCRUT3 dataset.”

    Yes, it is easy enough to do. I’ve done it in the past, and would just need to get the data through 2009. It is available not just for NH vs. SH, but broken out by other latitude bands (tropics, southern or northern extratropics). When I last looked at this, the cycles get murkier as you look at the data on a regional basis, but I do not find that surprising. Regional climate is going to vary more than global climate, the latter averaging out a lot of the regional variability. While I think the regional patterns are important, I think the global pattern is important as well. I think the global pattern, with periods of 248 and 110 months, likely reflect an interaction between the solar cycle and the lunar nodal cycle. These cycles are going to be less clear in the regional data, where various other climate oscillations perturb the periods. But the bidecadal period (i.e. roughly 20 years or so) appears in lots of climate records all over the world, even when the data has other periods and cycles from regional influences.

    I agree with the general tenor of your other comments — e.g. “There is nothing unusual whatsoever about our recent late 20th century warming period…” Until the late 1990’s, there was a strong research agenda in climate science on trying to understand natural climate variability. That research has taken a back seat to the AGW agenda. I’d like to see it come back sometime.

  24. Layman Lurker said

    Also, a plot of differences between raw and offset data may be revealing. Your previous post on raw data shows almost an identical trend (since 1978) to the offset data; however, it looks like the differences over the same time period would show a much more interesting picture than merely trendless noise.

    https://noconsensus.wordpress.com/2010/02/15/ghcn-gridded-temperature-data/#more-8066

    The spatial comparisons are also curious. Check out these comparisons (between raw and offset): Hudson’s Bay; US/Canadian border; Alaska. The area extending from southern Europe to continental Africa has about half a dozen adjacent grid pairs with -1 cooling next to +1 warming not seen at all in raw GHCN.

  25. Jeff Id said

    #24, I saw a lot more differential from one grid to the next also. It’s probably worth looking at an individual gridcell and the offsets used.

  26. Jeff Id said

    #23, If you want the monthly output the software above is easy to run from R. Otherwise, I could send you the output timeseries by email.

  27. Espen said

    #10 & #11: I’ve written many thousand lines of lisp code, and never counted parentheses. That’s what you have code-aware text editors for…

    I’m sure Greenspun’s tenth rule applies to R as well🙂

  28. Basil said

    #25 Jeff

    I know I should get up to speed with R, but haven’t had the time. Maybe when I retire from full time work later this year. Meanwhile, I crunch away with gretl. As for sending me the monthly series, that would be great. I think it would be very useful to see how your results hold up with the kind of processing I’m doing for climate cycle analysis. As I said to KevinUK, my email is blcjr2 at gmail dot com.

  29. Jeff Id said

    #28, I found another problem in the code which caused data to be unused. I’ll do another post shortly. It’s actually Roman’s method which keeps me going with this. It’s superior IMHO to the others and has an opportunity to improve the science.

  30. Tim said

    #29 – Sounds like a useful sensitivity test. If the missing data makes a large difference to the result, that would have implications when it comes to station drop outs etc.

  31. kdk33 said

    Just to be a kill-joy.

    I think global average temperature parameters are meaningless. No-body/thing/creature experiences this. In fact, the averaging exercise destroys a lot, if not most, of the information.

    The only exercise I find meaningful is to identify long lived quality data stations. That uncorrected data are then THE DATA – no other permutations allowed.

    We use those data to answer questions: Is it warming? Everywhere or only somewheres? All at the same time or at different times? At the same rates? More in cities or rural? More over land than over sea? Or is it just random? Etc. etc., you get the idea.

    I see no point whatsoever in averaging. The data are so poorly distributed, geographically and over time, that figuring out a proper “averaging technique”, with various fill-in-the-blank issues and duplicate-station issues (not to mention homogenization and other nonsense), makes it easy to lose sight of the point: Measuring Temperature. Which ought to be a rather simple exercise if we just left well enough alone.

    Alas, I feel I am alone. And I’m no expert.

  32. Tim said

    #31

    The GMST is used like a proxy for the earth’s energy balance, i.e. if the amount of energy in the atmosphere increases, the GMST should increase.
    IOW – it is a proxy – like tree rings – for the phenomena that we really want to measure.
    Scientists like Pielke Sr. think it is a bad metric to use but it is the convention.

  33. kdk33 said

    IOW – it is a proxy – like tree rings

    food for thought, no?

    (sorry, couldn’t help it)

  34. Tim said

    #33 – yep. the GMST is as meaningful as the width of a tree ring.

    Here is a paper that discusses how the air temp 5 ft above the surface is a biased proxy for the phenomena being measured.
    http://pielkeclimatesci.files.wordpress.com/2009/11/r-345.pdf

  35. RE27.

    I didn’t have that luxury. The guys in the AI group had some special lisp machines, can’t remember what they were called… ah, Symbolics.

  36. Ruhroh said

    Re; @27;

    Yeah, there was this cool way to punch a special control card and wrap it around this drum thing to automatically tab and format.
    My favorite part of programming was the automatic keypunch machine… Probably there was a way to have it manage the parens, but I never figured it out…

    Hey Jeff;
    2 questions.
    1. I’ve gotten various hints that most of the ‘GW’ is manifested in the winter Tmin values. Are you seeing that here also?

    2. I keep hoping someone’s mondo-grid-o-matic program will be sufficiently clean/fast that an Electrical Engineering ‘sensitivity’ analysis can be run, showing how sensitive the output is to imprecision in a particular ‘element’ of the network.
    Roman may have said that it was called ‘co-variance’ in statistics jargon, but then ~suggested it was nontrivial.

    How heroic would it be to make some kind of ‘script’ thing that methodically does variation of inputs to give a metric of the sensitivity? Maybe you don’t have much here without the 1200km ‘outreach’ feature in temporally/spatially ‘sparse’ areas of the world.

    Dayjob impingement is curtailing my already-weak code-reading ‘skill’ even with lovely commented stuff like yours here.

    TIA
    RR

  37. Bob Koss said

    KevinUK,

    Great job with the station mapping. I’ve been paying close attention to Giss the last couple months. They are playing around with their adjusted files. I just checked to see what you had for Antananarivo. What Giss has now is longer at both ends.

    I just finished a short write-up about what is going on at Giss. Haven’t had it posted anywhere yet. It’s ridiculous. I’ll email it to Jeff, maybe he’d be willing to post it up. I don’t think people are fully aware of the state of their program.

    I’d be interested in getting any of the Giss data you have. I hate having to constantly access stations individually.

    Jeff, check your email tomorrow. I’ll be sending you something.

  38. Manfred said

    (warming over land / warming over ocean) is about 1.5.

    http://www.agu.org/pubs/crossref/2007/2006GL028164.shtml

    Additionally, warming in the troposphere at 600 mbar should be a few tens of percent higher than at the ground.

    The best comparison would be with land-only data from GISS, CRU or NOAA, as already said.

  39. Basil said

    #29 Jeff

    No hurry. When you have something you are satisfied with, then will be the time for me to see how it compares with respect to periodicity.

  40. KevinUK said

    #37 BK,

    That’ll be Antananarivo, Madagascar then? Gavin (I don’t watch the Superbowl) Schmidt must be keeping a close eye on the ChiefIO and DigginintheClay blogs then, or perhaps it’s just coincidence. Have you noticed them filling in any missing Bolivian data at the same time?

    #31 kdk33

    “I think global average temperature parameters are meaningless. No-body/thing/creature experiences this. In fact, the averaging exercise destroys a lot, if not most, of the information.

    The only exercise I find meaningful is to identify long lived quality data stations. That uncorrected data are then THE DATA – no other permutations allowed.”

    I totally agree with you on this one. The whole process of attempting to average temperatures from different stations is flawed IMO, and the idea that you can detect a global trend in some meaningless parameter like mean global surface temperature anomaly as a consequence is utterly flawed.

    Let’s face it, this is only done so that the models can have something to attempt to demonstrate their hindcasting ‘skill’ against – and they are pretty hopeless at it even when they invoke mythical cooling forcings like sulphates to try to explain the 1940s to 1970s cooling period. While I’d accept that there is a need to adjust the data to allow for warming biases such as the UHI effect, equipment changes etc, there is no need to anomalise and grid the individual station data in order to see very clear spatial and temporal warming and cooling trends. IMO the primary reason why organisations like NCDC, GISS and CRU anomalise and grid the individual station data is to obfuscate (hide) the clear wholly natural cyclic trends in the data.

    There is no such thing as global warming. It is perfectly possible, as the raw temperature data maps show, for significant cooling to occur in a region adjacent to one where a warming trend is occurring at the same time. What significant warming there has been towards the later part of the 20th century has largely been in the Northern Hemisphere, and largely during the winter and early spring in the mid-latitudes. IMO there is clearly something fishy about the ‘claimed to be’ raw Canadian, Russian and Chinese data. Until such time as the post-1990 data has been added to the GHCN raw data (from which all 3 of the main MGST anomaly indices are derived), it cannot even be concluded with any level of certainty that the late 20th century warming trend was as significant as the 1910 to 1940 warming trend, let alone that it was ‘unprecedented’ as Mann and Co. would have us all believe.

  41. Jean said

    KevinUK — When using the various datasets, what GIS toolsets are you having the most success with. I would like to move beyond gridding the data by political boundaries or grid squares, and use a topo/terrain function for grouping station data.

  42. SteveWH said

    Jeff

    Help!!

    Everyone raves about R so I decided to give it a try by using your script above. Downloaded data from v2.mean as zip file (expanded it)and copied/pasted v2.country.codes and v2.temperature.inv into notebook and put them in C:/agw/GHCN/ as v2.mean, v2.country.codes and v2.temperature.inv.

    Copied/pasted your script into R. Named it agw.R

    Tried to run whole script got:

    > source (“/agw/agw.R”)
    Error in source(“/agw/agw.R”) : /agw/agw.R:9:9: unexpected symbol
    8: 008
    9: 009 psx.inv
    ^
    Tried to run just one section (line 79 to 95) and got:

    > source (“/agw/justdata.R”)
    Error in source(“/agw/justdata.R”) :
    /agw/justdata.R:6:9: unexpected symbol
    5: 079 ######## data load ########
    6: 080 loc
    ^
    Now, it is obvious that I know nothing about R. It seems obvious it is not as simple as cut/paste. Can you help me just get started here?

  43. Jeff Id said

    You need to put your drive name in the “”

  44. Jeff Id said

    At least that’s what I suspect. Did you download the gui version of R?

    I will be on a plane for the rest of the day and unable to answer. Perhaps someone else will have time.

  45. SteveWH said

    Jeff

    Put source(“C:/agw/agw.R”) – same result.

    Yes downloaded gui version for windows.

    Thanks for the reply. Have a good flight.

    Any other R gurus out there?

  46. KevinUK said

    #41 Jean,

    I decided not to use any of the ‘GISS toolsets’ particularly Makiko Sato’s gridding tools as they are clearly in error. Instead what I’ve done is write my own software for processing the NOAA NCDC and NASA GISS raw and adjusted datasets which I’ve called TEKtemp.

    I’m not doing my own adjustments/homogenisations or anything like that, I’m just taking their data (in the form of v2.mean and v2.mean_adj files), importing it into a MS Access database, combining the (often several) ‘duplicate series’ for any given combination of WMO station code and WMO modifier, i.e. a ‘station’ as they define it in their station inventory (v2.temperature.inv) file, and then fitting a linear regression trend to different time periods for each individual station.

    I apply my own quality control in regard to how I derive the seasonal mean and year mean values from the monthly data as I don’t trust the way in which NCDC and GISS calculate their seasonal and year means when there is missing monthly data for a given year. In my case I reject the whole year’s worth of data and do not calculate seasonal means or a year mean for that year if there is any missing data for any month. This is a bit tight I know, but IMO it’s the correct thing to do. I also do not fit a linear regression trend if there are less than 10 years of data available for the chosen time period for the given station. So for example if the time period is 1910 to 1939 then I will not fit a trend line if data is only available from say 1910 to 1913, then a gap, and then from 1930 to 1934, as that would only be nine years of data out of a possible 30 years of data. I do not take out any outliers or look for discrete changes in the data etc, so as you can see there can often be large (> +/- 5 deg.C/century) fitted trends.

    On closer inspection I’ve found that many of these large positive/negative trends appear to be ‘real’ in that they are also evident in many nearby stations. It therefore appears that rapid (often cyclical) climate change is not unusual, yet we are supposed to be concerned about a general warming trend, as a result of recovering from the LIA, of 0.6 to 0.7 deg.C/century.

    I’m sure if the polar bears and various other supposedly ‘at risk’ species due to climate change can adapt to a change in temperature of more than 0.5 deg.C over a decade then surely they can happily adapt to the same change over 100 years even if this projected increase only exists in the virtual world of the climate modellers and their positive feedbacks and is unlikely to ever occur in the real world.

    The ‘interactive maps’ I have created have all been done using a ‘free for educational use’ tool called ‘DIY maps’.

    http://backspace.com/mapapp/

    The software loads data in an XML file into a Flash shockwave file. It is possible to put ‘coloured dots’ on the map that represent a point value (in my case warming/cooling trend) at a given Lat/Long co-ordinate location. It is also possible to define regions on the map, e.g. countries, and colour code them as well. The shockwave file allows you to zoom into the map and to pan up/down and left/right. If you click on a given ‘dot’ then you can cause an ‘action’. In my case the action is to display a new browser window showing charts and summary tables of raw/adjusted data trends for that station.

    I personally don’t think there is much to be gained (in fact the opposite) by ‘anomalising and gridding’ the individual station data and showing ‘anomaly trends’ relative to some arbitrary ‘reference period’ (e.g. 1951 to 1980 for GISS) in the form of a colour contour map or some notional anomaly chart. As many have shown, including Jeff Id, this can lead to ‘spurious impressions’ of warming in certain parts of the world, e.g. Bolivia or Madagascar. IMO it is much better to just look at the spatial and temporal trends by looking at the trends for individual stations. That way the natural cyclic climatic variability (not ‘noise’) is very evident.

  47. Jean said

    Thank you KevinUK,

    I clearly do not like the “gridding” methods employed from several perspectives. My intention is to use the published data – warts and all – and illustrate the results of a variety of spatial accumulations.

    Obviously, someone could apply other datasets to see the results. I like the DIY Maps tool for presentation, and have queried the developer about implementations that use other, non-political, sections. In your experience was it easy to use and well documented?

    It might be interesting to map the data “adjustments” on a political map. I am certain that it would drive blog traffic.

  48. KevinUK said

    Jean,

    Great minds think alike, as the saying goes 🙂.

    Yes, the DIY map was very easy to use and is well enough documented IMO, as it is a relatively straightforward Flash application. You’d be surprised who actually uses it.

    http://cdm.unfccc.int/Projects/MapApp/index.html

    Yes the UNFCCC! I always like it when I get to use a tool that the ‘opposition’ is using to refute their claims. NOAA and GISS are hardly in any position to criticise my analysis when in fact it uses their actual raw data, sourced from their web site and mapped using an application endorsed by the UNFCCC.

    Because the DIY map uses Actionscript (an interpreted scripting language that runs with the Adobe Flash browser plug-in) and is reading XML files, it is rather slow in reading the data. As a consequence, when there is a lot of data to be input and displayed, e.g. for the 1880 to 2010 period, the Shockwave Flash player tends to throw up a number of annoying ‘it’s taking a long time’ prompts. I’m sure this is putting off some people who are visiting the map pages. Other than that I think it’s an excellent mapping tool and very easy to use. The tricky bit is getting the data you want to map into the required XML format, but the documentation tells you how to do that. It’s certainly better than GISS’s (Makiko Sato’s) somewhat pathetic effort.

    Another alternative is to use the Google mapping API or the Microsoft Virtual Earth API which has the benefit of incorporating all the satellite images. These are used by

    CDIAC for the USHCN – http://cdiac.ornl.gov/epubs/ndp/ushcn/ushcn_map_interface.html

    and

    Appinsys for the NOAA GHCN and HadCrut datasets - http://www.appinsys.com/GlobalWarming/climap.aspx?area=crutem3grids

  49. RomanM said

    Jeff, just for the record, I upgraded the offset method to be able to handle the weighted combination case in a proper manner. The R stuff is available here:

    http://statpad.wordpress.com/2010/03/04/weighted-seasonal-offsets/

    The way I set it up, this function will seamlessly handle the unweighted case if you run your earlier unweighted scripts as well.

  50. Alan S. Blue said

    Jeff,

    Is it possible to do the comparison with UAH on a per gridcell basis? That is: a map like figures three and four with “GHCN trend minus UAH trend” as the focus.

  51. Jeff,

    Where did you find the errors in gridding/weighting that reduced the temp? I’m still getting results from my model similar to your initial ones (0.258 C per decade 1978-2009 trend in annual anomalies) which is lower than the NCDC land temps (0.294 C per decade) but higher than GISS land temps (0.190), so I’m keen on nipping any bugs that might have cropped up.

  52. Jeff Id said

    Zeke,

    I’m using an offset value method of combining anomalies. See the statpad link on the right. It helps reduce step effects of adding and removing stations. The early version here was rejecting a substantial fraction of the data prior to averaging. It happened b/c of an if/then which accidentally eliminated whole series. I’ve nearly finished a new version but probably will take a day or two to get posted.

  53. Hmm, I wouldn’t think the differences between the offset value method and CAM could impact the trend quite that much. I do need to work on improving the way I combine multiple imods into single station_ids; I’ll see if that changes my results significantly.

  54. […] Air Vent I’ve been searching for other discussions. Jeff Id and Roman R appear to be working on creating temperature anomaly time series using “GHCN and Roman’s […]
