I originally posted on the Forster and Gregory 2006 method for determining climate sensitivity back in June. Since then, I’ve come across a number of issues (mostly discovered by others) on why this method may not constrain the sensitivity with the accuracy previously assumed. I will likely need to create a page of resources on this alone, because the number of links to blog posts and papers could easily grow unmanageable if I were to repeat them in every post. However, for now I’ll mention the recent Science of Doom and Isaac Held posts, and then begin listing some of the difficulties I see (particularly without using a limited time period):
- The unknown radiative forcing. This is Dr. Spencer’s main objection (SB08, SB10, and part of SB11), which suggests the forcing will lead to an underestimate of the response (and hence and overestimate of sensitivity) if it is correlated with T. Murphy and Forster 2010 argue that the effect is small, but it doesn’t look like their use of such a deep mixed layer was appropriate. Science of Doom has a good introduction to this issue in the post I link to above.
- There is not a strong reason to believe that the radiative response to a temperature change is both constant across all seasons and linear (also briefly mentioned in the SoD post). However, it theoretically could be a reasonable approximation.
- Dessler 2010 concludes that there is unlikely to be a correlation between the short-term/instantaneous cloud feedback and the long-term cloud feedback (at least, it’s not present in models). The set-up of FG06 implicitly only includes the short-term feedback, and since the cloud feedback is one of the biggest uncertainties surrounding equilibrium climate sensitivity, it suggests that the FG06 method could not necessarily diagnose the ECS in this case.
- There is a timing offset between the sea surface temperature changes and the bulk of the tropospheric temperature changes (1 – 3 months). When using a smaller period with interannual variations greater than the long-term trend (e.g., the CERES era), and with the bulk of the Planck response coming from the atmosphere, the timing offset would lead to an underestimate of the response.
- Using the method in the control runs of models generally leads to a large overestimate of the climate sensitivity (which suggests major issues with the method, the models, or both).
- Errors in the surface temperature measurements will yield underestimates of the response (and overestimates in the sensitivity). I specifically point to the surface temperature measurements rather than the satellite flux measurements because they act as the independent variable in the regressions, meaning they’ll lead to regression attenuation using traditional OLS methods (assuming Gaussian white noise for the flux measurements, we would get a lower correlation but not necessarily an underestimate).
- The sampling error using only (for example) 10 year periods could make it difficult to diagnose accurately.
For this post, I will briefly mention #5, and then use the GFDL 2.1 500 year control run from PCMDI to explore #6 and #7.
General Inaccuracies using the GCM Control Runs
For point #5, I will point to a perhaps unexpected place…figure 2 of Dessler 2011:
The black lines are from the control runs of the models. Note the regression slopes at 0 lag (which corresponds to the FG06 method), which I’ve circled in green. Now, the average ECS of these models we know to be about 3 K, which corresponds to a radiative response of about (3.8 W/m^2 / 3 K) = 1.27 W/m^2/K for the radiative response, a number that we’d expect to see as the average regression slope. But instead, the average regression slope is closer to 0.5 W/m^2/K, which corresponds to a whopping 7.6 K ECS, more than double of what is known for these models. Using the FG06 method in the control runs of the models thus overestimates the sensitivity in what appears to be 12 out of the 14 models. Dr. Spencer goes into some more issues of testing this against models.
Anyhow, there appear to be two models that actually show reasonable results when using the FG06 method. From the Trenberth, Fasullo, and Abraham response, it appears that one of them is ECHAM_MPI. The other one looks to be GFDL CM2.1 from my tests:
Closer Look at the GFDL CM2.1 Control Run
The GFDL CM2.1 has a ECS of 3.4 according to the IPCC AR4 table. This corresponds to a response of 1.12 W/m^2/K, which is almost the exact value I get when using the 500 years of the control run and monthly temperature anomalies (r^2=.09). Of course, it is curious why the correlation would be so low if the FG06 method uses an appropriate model, considering there is no measurement noise in the flux or temperature outputs from the model.
Anyhow, before continuing further, I’d like to show a chart of the control run global surface air temperature, which has got me scratching my head a bit:
Now, my understanding is that the pre-industrial control experiment does not include any change in forcings, and that the model is run to “stabilize” prior to the start of the control experiment. The gray lines are the monthly anomalies, while the black line is the 30 year moving average. Note that we see climate-scale trends (using the 30 year averages) that are completely unforced (if I’m understanding it correctly); this is not “natural variability” in terms of solar or volcanic variation, but rather in the “no TOA forcing changes” sense. Whether this is simply model drift, or is actually supposed to be simulating long-term, unforced variability, I’m not sure…it’s on the scale of 0.2C for what appears to be some < 75 year periods, which seems like it could be significant compared to the 20th century rise.
Anyhow, I will proceed with some different trials in order to diagnose the climate feedback in the model’s control run and compare it to the known value. First, I’ll note that using annual averages over the 500 year period gives me a response of 1.37 W/m^2/K (r^2 = 0.26), which would be an underestimate of sensitivity. Dr. Held, in his response to my comment on his blog, mentioned that using 1000 years (which I don’t think was available at PCMDI), he got response of 1.7 W/m^2/K. I was a bit surprised not to match his results, since 500 years seems like it would be enough to constrain it, until I broke it down into two 250 years periods and found responses of 1.56 and 1.7 W/m^2/K, which seems to suggest that there are periods of temperature change that are not met with corresponding radiative responses (perhaps this is what allows for the continuing drift?).
Using Absolute Values Rather Than Anomalies
Anyhow, in Murphy et. al 2009, they extend the FG06 method to use CERES monthly observations. One curious point is that in figure 2, they use interannual AND seasonal variations (that is, they don’t take monthly anomalies) to show the radiative response. Steve McIntyre has explored this as well. The result is that you get higher r^2 values, but I think this may inflate the confidence we should have in the Murphy (2009) method. After all, it doesn’t seem that seasonal changes in temperature and the radiative response would necessarily be equal to longer term changes, and in fact using absolute values for monthly temperature anomalies and fluxes with the 500 year control run results in a response of 4.55 W/m^2/K (way higher than the known long-term value) with a r^2 = 94%! However, that 94% clearly is not indicative that it is accurately reflecting the ECS. Of course, such a high value may only suggest that the GFDL CM2.1 model overestimates the flux response to seasons, or underestimates the seasonal temperature response.
Regression Attenuation from Uncertainty in Surface Temperatures
I added white noise with a standard deviation of .075 C into the surface air temperatures to simulate measurement/sampling noise, based on estimates of the uncertainty presented in the charts of the HadCRUTv3 paper towards the early 21st century (during the CERES era). This resulted in the expected regression attenuation in the 500 year monthly test, bringing the estimated response down to about 1.02 W/m^2/K, and thus yielding a slight overestimate of the sensitivity. It is curious that the use of TLS to avoid this attenuation is not examined in more detail. For instance, FG06 mention:
For less than perfectly correlated data, OLS regression of Q-N against T_s will tend to underestimate Y values and therefore overestimate the equilibrium climate sensitivity (see Isobe et al. 1990).
The reason main reasons that they give for sticking with OLS is two-fold: 1) the issue of cause and effect (T_s inducing the radiative flux changes at short time scales, not the opposite), also discussed in MF09, which is in dispute by Spencer in point #1 of the post, and 2) that using it on the model HadCM3 does not yield accurate results using different regressions.
I will point out that even IF no unknown radiative forcing is confounding the feedback signal in this way, this still does NOT mean that there is no “error” in the independent variable…as shown above, HadCRUTv3 includes estimates of uncertainties. Furthermore, that another method correcting for these errors (both in terms of cause and effect and measurement error) would yield overestimates of the response in a climate model that includes neither these specified variations in cloud forcing nor measurement error is unsurprising, but it says nothing about whether a different regression method is appropriate for real world data that DOES include such errors. Finally, I found the following statement from FG06 appendix quite interesting:
Another important reason for adopting our regression model was to reinforce the main conclusion of the paper: the suggestion of a relatively small equilibrium climate sensitivity. To show the robustness of this conclusion, we deliberately adopted the regression model that gave the highest climate sensitivity (smallest Y value). It has been suggested that a technique based on total least squares regression or bisector least squares regression gives a better fit, when errors in the data are uncharacterized (Isobe et al. 1990). For example, for 1985–96 both of these methods suggest YNET of around 3.5 +.- 2.0 W/m^2/K (a 0.7–2.4 K equilibrium surface temperature increase for 2 x CO2), and this should be compared to our 1.0–3.6-K range quoted in the conclusions of the paper.
Murphy et. al (2009) explore orthogonal regression a bit as well, but I couldn’t find anything that explicitly takes into account the known uncertainties in the surface temperatures.
Sampling Error in 10 year intervals
Finally, I will look at the different estimates we get for the climate radiative response when breaking the 500 year control run into 50 10-year periods (and still including noise). I set the standard deviation for Net TOA flux measurements to 0.33 W/m^2 per month when adding in the white noise (based on estimated CERES RMSE (SW + LW) / sqrt(2)) The red lines represent the “true” radiative response. Using monthly values, we get the following results over 10 year periods:
Based on those responses, this includes climate sensitivities from 1.7 C to 14.2 C (!) based on a 10-year period, when the known sensitivity is 3.4 C.
For annual data (which should reduce some of the measurement noise, but yield a smaller sample size), we get:
Which includes everything from 0.8 C to 13.1 C.
Interestingly, these sampling errors tend to lead towards overestimates of the response, or underestimates in climate sensitivity. I’m not quite sure at the moment why this should be the case.
All code and data for this post is available here.