Troy's Scratchpad

February 23, 2013

Could the multiple regression approach detect a recent pause in global warming? Part 3.

Filed under: Uncategorized — troyca @ 11:29 am

Part 1

Part 2

In the first two parts of this series, I demonstrated how multiple regression methods that assume an underlying linear "signal" are unable to properly reconstruct a pause in surface temperature warming when attempting to remove the volcanic, solar, and ENSO components from my simple energy balance model.  That is, for an approach similar to Foster and Rahmstorf (2011), the method will tend to underestimate the warming influence of volcanic recovery and overestimate the cooling influence of solar activity over recent decades to compensate for the pause.  With the improvement Kevin C mentioned, there is some ability to detect a longer tail for the volcanic recovery (indeed, it does so nearly perfectly if the underlying signal is actually linear), and the solar influence is no longer over-estimated.  Unfortunately, it still underestimates the recent warming influence from volcanic recovery in my energy balance model in the "flattening" scenario 2.

I had thus wondered whether this long-tailed volcanic recovery was merely an artifact of my simple model, or whether volcanic recovery may indeed have contributed substantial warming from 1996 (when the Pinatubo stratospheric aerosols were virtually gone) onward.  There are not that many models that have contributed volcanic-only experiments to CMIP5 (I showed one in Part 1, and Gavin showed an ensemble for GISS-EH at RealClimate in response to this discussion).  However, there is plenty of data from the natural-forcing-only historical experiment, which, by averaging several of the runs for a particular model, can give us a good idea of the forced volcanic + solar influence in those GCMs.

In the figure below, I have shown the mean of the historicalNat runs for 7 individual CMIP models that have 4 or more of these experiment runs.  As such, this should give an idea of the forced response in these models without much additional unforced variation.  I have also plotted on the same chart the volcanic + solar influence as diagnosed by the FR11 and Kevin C methods when using the HadCRUTv4 dataset. 
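The reason for requiring several runs per model can be sketched numerically: unforced variability is roughly independent across runs, so averaging N runs shrinks its standard deviation by about 1/sqrt(N) while leaving the forced signal intact.  A minimal illustration with entirely synthetic data (the signal shape and noise level below are arbitrary choices, not model output):

```python
import numpy as np

rng = np.random.default_rng(2)
n_months, n_runs = 600, 4

# Shared forced signal (stand-in for the solar + volcanic response)
months = np.arange(n_months)
forced = 0.2 * np.sin(2 * np.pi * months / 132)

# Each run = the same forced signal + independent unforced "weather" noise
runs = forced + rng.normal(0, 0.15, (n_runs, n_months))

ensemble_mean = runs.mean(axis=0)

# Residual noise in the mean shrinks by roughly 1/sqrt(n_runs)
single_err = np.std(runs[0] - forced)      # about 0.15
mean_err = np.std(ensemble_mean - forced)  # about 0.075
```

With 4 runs, the residual noise in the ensemble mean is roughly half that of a single run, which is why the figure restricts itself to models with 4 or more historicalNat runs.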


As can be seen, the volcanic response in all of these AOGCMs is far larger and has a longer tail than diagnosed by the multiple regression methods.  Now, it is certainly possible that these volcanic responses in AOGCMs are too large, as there is evidence to suggest that the CMIP5 runs don’t properly simulate this response.  However, the fact that the FR method shows far lower sensitivity to volcanoes, while simultaneously showing a much larger sensitivity to solar influences than either GCMs or simple energy balance models would indicate, suggests that it may be compensating for the recent flattening.  Indeed, it is quite difficult to conceive of a realistic, physics-based model that does not indicate a substantial volcanic-recovery-induced warming contribution after 1996, despite that contribution being virtually non-existent in the FR11 diagnosis (the increase around 1998 in the FR line is actually solar-induced).

The table below highlights the warming contribution of the model ensembles (in K/Century, so be careful!) from the indicated start year through 2012 (I have an * by CCSM4 because the runs end in 2005).  



For comparison, the HadCRUTv4 trends over these same periods are

1979-2012: 1.55 K/Century
1996-2012: 0.91 K/Century
2000-2012: 0.38 K/Century
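For reference, a trend in K/Century like those above is just the OLS slope of the monthly anomaly series, rescaled.  A sketch with a synthetic series (the trend value is chosen to match the 1979-2012 figure; the data are random noise around it, not HadCRUT):

```python
import numpy as np

rng = np.random.default_rng(4)
months = np.arange(12 * 34, dtype=float)  # 1979-2012 is 34 years of monthly data

# Synthetic anomaly series with a known underlying trend of 1.55 K/Century
true_slope = 1.55 / 1200.0  # K per month (1200 months per century)
anom = true_slope * months + rng.normal(0, 0.1, months.size)

# OLS slope, rescaled from K/month to K/Century
slope_per_month = np.polyfit(months, anom, 1)[0]
trend_k_per_century = slope_per_month * 1200.0
```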

If one believes that this range of GCMs represents the true forced response of solar + volcanic, it would suggest that these natural forcings were responsible for 15% to 51% of the warming trend from 1979-2012.  If I had to bet, I would probably put it on the lower end, as the AOGCMs appear to be a bit too sensitive to these radiative perturbations and suggest too much ocean heat uptake, which probably creates longer tails on the early volcanic eruptions than is warranted.  However, I do think the contribution is probably greater than 0%, which is about what the FR method puts it at.
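The 15% to 51% range is simple arithmetic: each model's natural-only trend divided by the observed 1.55 K/Century.  The per-model trends sit in the table, so the endpoints below are back-computed from the quoted percentages rather than taken from any particular model:

```python
# Attribution fraction = natural-only model trend / observed trend.
observed_trend = 1.55  # HadCRUTv4, 1979-2012, K/Century

# Illustrative endpoints only, back-computed from the quoted 15%-51% range
natural_trends = [0.23, 0.79]  # K/Century
fractions = [nt / observed_trend for nt in natural_trends]
```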

The periods from 1996 to present and from 2000 to present, however, are where I think we see the larger misdiagnosis.  Whereas all models (including my simple energy balance model) indicate that the solar + volcanic influence from 1996 to present was positive, comparable in amount (median: 0.81 K/century, mean: 1.05 K/century) to the actual HadCRUT trend, both regression methods suggest either a slightly negative or near-zero influence from these components.  And from 2000 to present, while the models are more split (with only 6 of the 7 suggesting a positive influence, and the range varying more widely), it is difficult to believe that the actual influence of solar + volcanic is as strongly negative as the FR method indicates.  This is why it looks to me like the multiple regression method underplays the influence of volcanic recovery in order to partly compensate for a recent pause.

Essentially, we are left wondering if the GCMs are too sensitive to volcanic eruptions, and/or if the multiple regression method is underestimating their influence to compensate for a recent pause.  Again, if I had to bet, it would probably be in the middle – the GCM response is generally a bit too large, but the response is not nearly as small (or short) as the FR11 method would indicate.   

Data (including all of the globally processed tas for the models shown, please give credit here if you use them in this processed form) and code are available.


February 20, 2013

Could the multiple regression approach detect a recent pause in global warming? Part 2.

Filed under: Uncategorized — troyca @ 8:36 pm

Previously, I posted on the multiple regression method – in particular, the method employed in Foster and Rahmstorf (2011) – and how, when attempting to decompose the temperature evolution of my simple energy balance model into its various components (signal, ENSO, solar, and volcanic), this method encountered two large issues:

1) It did not adequately identify the longer term effect of the volcanic recovery on temperature trends, and

2) It largely overestimated the solar influence.
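For concreteness, the basic shape of an FR11-style fit is an ordinary multiple regression of temperature on a linear time term plus the three exogenous indices.  The sketch below uses entirely synthetic data and omits the lag search of the real method; it illustrates the structure, and is not the actual F&R code:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 360                      # 30 years of monthly data
t = np.arange(n) / 120.0     # time in decades

# Synthetic stand-ins for the exogenous indices (MEI, volcanic AOD, TSI)
enso = rng.normal(0, 1, n)
volc = np.zeros(n)
volc[60:90] = 1.0                    # a single "eruption" pulse
solar = np.sin(2 * np.pi * t / 1.1)  # ~11-year cycle (in decades)

# "True" temperature: linear signal + scaled components + weather noise
y = 0.17 * t + 0.07 * enso - 0.30 * volc + 0.05 * solar + rng.normal(0, 0.05, n)

# FR11-style fit: OLS on [intercept, linear trend, ENSO, volcanic, solar]
X = np.column_stack([np.ones(n), t, enso, volc, solar])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# "Adjusted" series: subtract everything except the intercept and trend
adjusted = y - X[:, 2:] @ beta[2:]
```

The built-in linear `t` column is the key assumption: if the true underlying signal is not linear, the misfit has to be absorbed somewhere, typically by distorting the ENSO, solar, and volcanic coefficients.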

If you recall, I tested two scenarios in that original post.  The first scenario was a linearly increasing underlying signal.  The second scenario was a combination of a linearly increasing signal and an underlying low-frequency oscillation, resulting in a flattening of recent temperatures (one that was not caused by the combination of ENSO, volcanic, and solar influences).  The goal was to see whether this multiple regression method could identify the flattening if it existed. 

Thanks to Kevin C, who suggested and implemented a few improvements to this F&R method, noting them in the comments of that post: “…tie the volcanoes and solar together as forcings and fit a single exponential response term instead of a delay.”  This would allow a tail for the recovery from volcanic eruptions well beyond the removal of the actual stratospheric aerosols, and would not allow an over-fitting of the solar influence.  After implementing this newer method, I would say that it is a large improvement (at least in diagnosing my simple EBM components) in the first scenario of a linearly increasing trend:
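Kevin C's modification can be illustrated the same way: instead of fitting separate per-predictor lags, convolve the combined natural forcing with an exponential decay kernel and search over the time constant.  Again, everything below is synthetic and is a sketch of the idea, not his implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 480
t = np.arange(n, dtype=float)  # months

# Combined natural forcing (stand-in): one volcanic pulse plus a solar cycle
forcing = -1.5 * ((t > 120) & (t < 144)) + 0.1 * np.sin(2 * np.pi * t / 132)

def exp_response(f, tau):
    """Convolve forcing with a normalized exponential decay (time constant tau, in months)."""
    kernel = np.exp(-np.arange(n) / tau)
    kernel /= kernel.sum()
    return np.convolve(f, kernel)[:n]

# Synthetic temperature: slow linear signal + lagged response + noise
true_tau = 30.0
y = 0.001 * t + 0.4 * exp_response(forcing, true_tau) + rng.normal(0, 0.02, n)

# Grid-search tau; at each tau, fit amplitude and trend by least squares
best = None
for tau in np.arange(5.0, 80.0, 1.0):
    X = np.column_stack([np.ones(n), t, exp_response(forcing, tau)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ beta) ** 2)
    if best is None or sse < best[0]:
        best = (sse, tau, beta)

sse, tau_hat, beta_hat = best  # tau_hat should land near true_tau
```

Because the single tau controls how long the volcanic tail persists, the fit can recover a multi-year recovery even after the aerosols themselves are gone, which is exactly what the original delay-based fit could not do.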


Unfortunately, due to the underlying assumption implicit in this method of a linear trend, it still has trouble identifying the recent pause present in scenario 2:


To see where exactly it is going wrong in scenario 2 vs. scenario 1, we can again look at the individual components:


(Figures: diagnosed solar and volcanic components)

As should be clear, the improvements suggested by Kevin C generally improve performance across the board.  Unfortunately, in the 2nd scenario with the flattening, the multiple regression method still tries to compensate for the flattening by decreasing the diagnosed influence of volcanic recovery, therefore leading to a misdiagnosis. 

Dikran Marsupial noted in the comments of that last post that “there are no free lunches.”  Perhaps this helps drive the point home that assuming an underlying linear trend will lead to this misdiagnosis if the increase is not linear.  I hope to investigate further the actual influence of solar + volcanic activity on recent temperatures using some GCM runs.     

February 13, 2013

Our paper on UHI in USHCN is now published

Filed under: Uncategorized — troyca @ 4:56 pm

As you know, my first interest and the bulk of the early articles for this blog dealt with the question of the urban heat island (UHI) influence on U.S. historical temperatures.  Our paper on this topic is now available (pre-print version), and Zeke (the lead author of the paper and the one who wrangled everyone together!) and Matthew Menne put together a good post on it over at RealClimate.

Apart from the use of several different proxies for urbanization, and the thorough treatment of many UHI-related topics, I personally think an interesting aspect of this paper is how it delves into the potential issue of “urban bleeding” during homogenization.  For those who have followed various discussions on the topic over the past few years, or have read this paper already, it is clear that the UHI signal appears much more strongly in the TOB data than in the F52 homogenized data.  A while back I also had a post, using synthetic data, that showed how the F52 algorithm could potentially alias some of the heat from urban stations into rural stations, thereby removing the appearance of UHI without removing the UHI itself.
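The aliasing concern is easy to demonstrate with a toy model.  The real pairwise homogenization algorithm works via changepoint detection, not simple blending, so the sketch below (which just nudges a rural series toward a neighbor average that includes an urban station) only illustrates the mechanism, not USHCN's actual behavior:

```python
import numpy as np

t = np.arange(100) / 10.0  # time in decades

# True series: the rural station has no UHI; the urban one drifts warm
background = 0.1 * t               # shared climate trend, K
rural_true = background
urban = background + 0.1 * t       # extra 0.1 K/decade of UHI drift

# Naive neighbor-based "adjustment": blend the rural series toward a
# neighbor average that includes the urban station
neighbor_avg = (urban + rural_true) / 2
rural_adj = 0.5 * rural_true + 0.5 * neighbor_avg

def trend(y):
    return np.polyfit(t, y, 1)[0]  # K per decade

# trend(rural_true) is 0.10, but trend(rural_adj) is 0.125: a quarter of
# the urban drift has been aliased into the "rural" record
```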

On the one hand, if you look at figure 9 in the paper, I think it confirms the concern that the homogenization process could potentially spread urban warming to rural stations, as seen in the urban-only adjustments.  On the other hand, I also think it shows that in the case of USHCN v2, this effect is pretty minor based on using only ISA < 10% for adjustments.  Now, one might wonder about UHI spreading from stations with ISA < 10% (that is, whether these “rural” stations are not strictly “rural”, and are themselves contaminated by the UHI).  Thus, I thought it might be interesting to show another couple of figures here, which show the difference in the “urban” vs. “rural” trends based on what cut-off in the ISA classification is used to define “rural”:



As you can see, the bulk of the UHI signal in TOB comes from those stations with ISA > 10%, such that the use of 10% seems a pretty solid cut-off for “rural”.   
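The cut-off comparison in those figures amounts to splitting stations at an ISA threshold and differencing the mean trends of the two groups.  A sketch with hypothetical station data (here the UHI component is constructed to kick in above 10% ISA, so the numbers are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
n_stations = 500

# Hypothetical stations: impervious surface area (ISA, %) and a trend that
# includes a UHI component only above ~10% ISA (constructed, not real data)
isa = rng.uniform(0, 60, n_stations)
trend = 0.20 + 0.004 * np.clip(isa - 10, 0, None) + rng.normal(0, 0.02, n_stations)

def urban_minus_rural(cutoff):
    """Mean trend difference between stations above and below an ISA cutoff."""
    return trend[isa >= cutoff].mean() - trend[isa < cutoff].mean()

diffs = {c: urban_minus_rural(c) for c in (1, 5, 10, 20)}
```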

Nevertheless, for an additional demonstration, we can use only the most rural stations (< 1% ISA) from a dataset that has only been adjusted by other most rural stations (< 1% ISA).  Here is that final result when compared against the gridded F52 all-adjusted, as well as GISS:




From a visual perspective, it seems fairly clear to me that there is not much difference, and the numerical results below seem to bear this out for the most part.  The exception is one we discussed in the paper, where the USHCN v2 all may have some residual UHI in the early part of the record and require an additional adjustment (as the one used in GISS).

1960-2010 Trends

F52 all: 0.224 K/Decade
F52-ISA01-ruralAdj: 0.219 K/Decade
GISTemp: 0.208 K/Decade

1885-2012 Trends

F52 all: 0.072 K/Decade
F52-ISA01-ruralAdj: 0.054 K/Decade
GISTemp: 0.058 K/Decade

It is thus my opinion that the impact of UHI in the homogenized USHCNv2 is minor.  This paper does not specifically speak about the UHI influence on a global scale, nor does it specifically consider micro-siting issues.  However, my initial impressions regarding the homogenization lead me to believe that there is unlikely to be any strong micro-siting bias permeating throughout the USHCN dataset.

For those interested, Zeke has already linked to the code used for the paper here.   The specific tests I ran for this post make use of that Java code and data, and the R script for graphing and batch files (which can easily be converted to shell scripts) are available from me here.
