Troy's Scratchpad

March 15, 2014

Does the Shindell (2014) apparent TCR estimate bias apply to the real world?

Filed under: Uncategorized — troyca @ 9:35 am

There has been considerable discussion of Shindell (2014) and its suggestion that the usual estimates of TCR (which assume roughly equal efficacies for different forcings), such as Otto et al. (2013), might be underestimating TCR.  A few examples discussing Shindell (2014) are at Skeptical Science, And Then There’s Physics, and Climate Audit.  SkS’s Dana went so far as to say at James Annan’s blog that the paper “demolishes” Lewis and Crok’s report, but JA responded quite skeptically to the Shindell (2014) results.

On the face of it, the argument is fairly simple and intuitive (so buyer beware!): since the cooling effect of aerosols generally occurs in the Northern Hemisphere, where there is greater land mass and thus lower effective heat capacity, these forcings will disproportionately affect the global temperature relative to the forcing of well-mixed greenhouse gases, which acts globally.  Since Watt per Watt these cooling forcings will give you more bang for your buck, an estimate of TCR using only the globally averaged forcing and global temperature could be biased low.  Shindell (2014) therefore tries to find the average “enhancement” of aerosol+O3 forcings, E, through GCMs, and uses the following to calculate TCR from global quantities:

TCR = F_2xCO2 x (dT_obs / (F_ghg + E x (F_aerosols + F_Ozone + F_LU)))
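To make the role of E concrete, here is a minimal Python sketch of the equation above. The function name, the default F_2xCO2 of 3.7 W/m^2, and all input numbers are placeholders chosen for illustration; none of them are taken from S14 or from any dataset.

```python
def tcr_estimate(dT_obs, F_ghg, F_aer, F_o3, F_lu, E=1.0, F_2xCO2=3.7):
    """Energy-budget TCR (K) with enhancement E applied to the inhomogeneous forcings.

    E=1.0 reproduces the "simple" method; E>1 is the S14-style correction.
    F_2xCO2 (W/m^2) is an assumed value for the doubled-CO2 forcing.
    """
    effective_forcing = F_ghg + E * (F_aer + F_o3 + F_lu)
    return F_2xCO2 * dT_obs / effective_forcing


# Placeholder inputs (hypothetical, for illustration only).
inputs = dict(dT_obs=0.8, F_ghg=2.8, F_aer=-1.0, F_o3=0.3, F_lu=-0.1)

simple = tcr_estimate(**inputs, E=1.0)    # simple method
enhanced = tcr_estimate(**inputs, E=1.5)  # hypothetical enhancement factor

print(f"simple:   {simple:.2f} K")
print(f"enhanced: {enhanced:.2f} K")
```

Note that E > 1 raises the TCR estimate only when the sum of the aerosol, ozone and land-use forcings is negative, since it then shrinks the denominator.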

However, there are reasons to be skeptical of this result as well.  For one, there *have* been studies that specifically looked at the rate of warming and certainly don’t assume homogeneous forcings, such as Gillett et al. (2012), which finds low TCR estimates consistent with Otto et al. (2013).  Furthermore, Shindell (2014) does not seem to consider the ratio of observed warming in the NH vs. SH.  Using Cowtan and Way (2013) with HadCRUT4 kriging, from the base period (1860-1879) through the end of the historical simulation time period (1996-2005), the ratio of NH warming to SH warming is 1.48.  Obviously, if there were a large cooling effect from aerosols concentrated primarily in the NH (due to a large enhancement of the aerosol effect), we would expect to see more warming in the SH than the NH!  Third, there did not seem to be any tests run on the actual historical simulations from models, which would tell us how well the Shindell (2014) method performs relative to the “simple” E=1.0 method (e.g. Otto et al).  These last tests should be easy to pass, since the value of “E” would be calculated from the same model that the tests are run on (unlike the real world, where we don’t know the “true” value of E).

The first table simply shows the forcings and temperature changes using the same models as S14, as much of this information is available in the supplement.  These tests will be based on the difference between the base period (1860-1879) and the end of the historical simulation (1996-2005) using the historical runs.

TABLE 1. Temperature + Forcing from historical simulations and Aero/O3 Enhancement (from Shindell 2014)

One thing that has been nagging me about this is that natural forcings are not included in the TCR equation above.  I am not sure if the slightly positive solar influence is balanced out by the slightly negative volcanic influence in models, or what, but S14 does not include estimates of these natural forcings in the models so I have not included them in the tests either. And here are the results of the actual tests using the above numbers: 

TABLE 2. Estimate of TCR using Simple (E=1.0) estimate vs. Shindell (2014) methods, along with NH/SH warming ratios


[Table columns: Simple Estimate (K), Shindell Estimate (K), TCR Actual (K), NH/SH Hist; last row: Observed (CW13)]

Assuming I have not messed something up here, these results appear to be very concerning for Shindell (2014).  For example, IPSL-CM5-LR, the model from S14 with the largest “enhancement” at E=2.43, would be expected to yield a major underestimate of TCR using the simple method.  Instead, the simple estimate only underestimates TCR by 6%, whereas applying the S14 “correction” makes things far worse, yielding a 40% overestimate of TCR!  In fact, in 4 of the 6 models, the Shindell method overestimates the TCR by > 30%.  On the other hand, the “simple” method only underestimates TCR by > 30% in 1 of the 6 cases.  

Perhaps even more concerning, however, are the specific model ensembles for which the Shindell (2014) method largely overestimates the TCR.  Given the observed NH/SH warming ratio of 1.48, the two models that are most realistic in this regard are IPSL-CM5-LR (1.49) and the average of MRI, NOR, and MIROC (1.58).  Since the argument from Shindell (2014) essentially hinges on the NH being disproportionately cooled by aerosols relative to the SH, these are the most directly relevant.  And yet, using the “simple” method in these cases produces underestimates of 6% and 2%, which would hardly change the results of a paper like Otto et al. (2013).  On the other hand, the Shindell (2014) “correction” causes overestimates of 40% and 34% (e.g. a shift from a 1.3K “most likely” value from Otto to the 1.7K reported by S14).  If we look at the model that the “simple” method largely underestimates, GFDL-CM3, we see that the 0.84 NH/SH ratio in that model is the farthest away from the one we’ve observed in the real world, suggesting it is likely the least relevant.

In fact, I would argue that the amplification of the NH/SH ratio in the historicalGHG simulation relative to the historical simulation for a model could be used to better estimate the TCR “bias” calculated using that model.  This is because the difference in the NH/SH ratio in the historicalGHG simulation and that of the historical simulation implicitly combines the actual aerosol forcing and the “enhancement” of this forcing (rather than trying to estimate these highly uncertain values separately), which is even more directly relevant to the degree of TCR bias.  Indeed, if we look here, there appears to be excellent correlation:




Impressive, eh?   Now, I will mention that I believe things to be less pretty than the r^2=0.92 value shown above.  This is primarily because the 3 models bundled in Shindell et al (2014) actually have very different rates of global warming, as well as the NH/SH ratios, so I’m not sure it makes sense to bundle them, but have done so here for consistency with S14. 

One problem with applying my method to the “real world” is that we don’t know what the NH/SH warming ratio would be for the real world in the GHG-only scenario.  However, given the high value of 1.48 observed for the real-world “all-forcing” case, I suspect that the difference between this ratio and the GHG-only scenario can’t be that large, unless models have seriously underestimated the historicalGHG warming ratios.  Moreover, I would argue that the value is likely to be better constrained from models than the value of E, which depends on the much more uncertain aerosol properties.  Regardless, the average model warming ratio for NH/SH for the historicalGHG simulations in the above models is 1.54.  If you plug in the observed value of 1.48 for the “historical” observed NH/SH ratio in the real world, and use the linear regression from above, the estimated TCR bias amplification factor is 0.96.  This would suggest using the “simple” method slightly overestimates TCR, but by an extremely small fraction.


While Shindell (2014) uses several GCM results to argue that traditional methods to calculate TCR lead to an underestimate, testing these methods against the outputs of those same GCMs seems to suggest the “simple” (E=1.0) methods perform better than the S14 “corrected” method.  Furthermore, when we consider the actual observed NH/SH warming ratio, it also seems to suggest that the TCR bias in the traditional/simple method is either very small or non-existent. 

Data and code.

March 11, 2014

How sensitive are the Otto et al. TCR and ECS estimates to the temperature and ocean heat datasets?

Filed under: Uncategorized — troyca @ 10:59 pm

There seems to have been some interest in the sensitivity of Otto et al. (2013) lately.  For instance, see Piers Forster’s comments about HadCRUT4 in Otto, or Trenberth and Fasullo (2014), which notes that using ORAS-4 for the OHC dataset raises the estimate of ECS from 2.0 to 2.5 K.   Anyhow, I ran a few of the tests, and thought I would share the results. 

Otto et al (2013) uses five different intervals over which the differences of temperature, forcing, and TOA imbalance (for the ECS estimate) are calculated: 1) 2000s – Base, 2) 1990s – Base, 3) 1980s – Base, 4) 1970s – Base, and 5) the 40 year interval from 1970-2009.  The base period used is 1860-1879.  Here are the results when using the BEST or Cowtan and Way (2013) global temperature datasets (which infill under-sampled regions to get more global coverage)  to calculate TCR alongside HadCRUT4:



[Table columns: ΔF (W/m^2), CW13 ΔT (K), CW13 TCR (K)]

In most situations, it appears that the difference is relatively small, adding ~0.1K of TCR when using either Cowtan and Way (2013) or BEST global temperatures.    
For ECS, I used Cowtan and Way (2013) for the temperature dataset and three different ocean heat content datasets: 1) Lyman and Johnson (2014), 2)  ORAS-4, the reanalysis product introduced in Balmaseda et al. (2013), and 3) Levitus et al (2012).  For #1 and #2, I digitized the values from their respective papers.  Since the calculation of ECS is slightly more complicated, I have included additional steps along the way.  OHU represents the rate of ocean heat uptake as calculated over the period from the observed ocean heat content,  and ΔQ represents the change in TOA imbalance over the interval (I assume 90% of the TOA imbalance goes into the ocean, and per Otto et al that the imbalance of the base period was 0.08 W/m^2).  I also include a “most recent” period calculation, which occurs if we measure the OHC after the bulk of the ARGO deployment took place.  For LJ14 they present this value from 2004-2011.  For L12, they don’t present annual estimates until 2005, so I’ve used 2005-2013.  For B13, I simply used 2004-2010.  Here are the results:



[Table columns: ΔF (W/m^2), CW13 ΔT (K), LJ14 OHU (W/m^2), LJ14 ΔQ (W/m^2), LJ14 ECS (K), B13 OHU (W/m^2), B13 ΔQ (W/m^2), B13 ECS (K), L12 OHU (W/m^2), L12 ΔQ (W/m^2), L12 ECS (K); last row: Post_ARGO(2004 or 2005)-Base]

As you can see, the bulk of the estimates still seem to suggest an ECS < 2K, in line with the Otto et al (2013) calculations.  For LJ14, the results are actually pretty tightly constrained (apart from the “low-ball” 1980s estimate) between 1.7 and 2.0 K.  Using ORAS-4 (B13) does increase the ECS estimate when using the 2000s only, but it also largely decreases it when using the 1990s or 1970s.  When using only post-ARGO data, it is pretty much in line with the others, suggesting an ECS ~ 2K.  L12 is also fairly tightly constrained, apart from the “highball” estimate of the 1970s.  Overall, using ORAS-4 produces the largest error margins and interval-specific sensitivity. However, it is worth noting that using the post-ARGO OHC data seems to largely remove the dependence on which OHC dataset is chosen, producing best estimates between 1.7 and 2.1 K for ECS.
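For readers who want to follow the OHU to ΔQ to ECS chain described above, here is a rough Python sketch. It reflects my reading of the steps (90% of the TOA imbalance going into the ocean, a base-period imbalance of 0.08 W/m^2). The doubled-CO2 forcing of 3.44 W/m^2 and the ΔT, ΔF and OHU inputs are placeholder assumptions, not values from the tables in this post.

```python
F_2XCO2 = 3.44         # W/m^2, assumed forcing for doubled CO2
OCEAN_FRACTION = 0.90  # share of the TOA imbalance assumed to enter the ocean
BASE_IMBALANCE = 0.08  # W/m^2, TOA imbalance assumed for the 1860-1879 base period


def delta_q(ohu):
    """Change in TOA imbalance (W/m^2) implied by an ocean heat uptake rate (W/m^2)."""
    return ohu / OCEAN_FRACTION - BASE_IMBALANCE


def tcr(dT, dF):
    """Energy-budget transient climate response (K)."""
    return F_2XCO2 * dT / dF


def ecs(dT, dF, dQ):
    """Energy-budget equilibrium climate sensitivity (K)."""
    return F_2XCO2 * dT / (dF - dQ)


# Placeholder inputs (hypothetical, for illustration only).
dT, dF, ohu = 0.75, 1.95, 0.60
dQ = delta_q(ohu)

print(f"dQ  = {dQ:.2f} W/m^2")
print(f"TCR = {tcr(dT, dF):.2f} K")
print(f"ECS = {ecs(dT, dF, dQ):.2f} K")
```

Because ΔQ is subtracted from ΔF in the denominator, a larger ocean heat uptake for the same ΔT and ΔF gives a higher ECS, which is why the choice of OHC dataset matters for ECS but not for TCR.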

Data for this post available here.

February 28, 2014

Initial thoughts on the Schmidt et al. commentary in Nature

Filed under: Uncategorized — troyca @ 7:20 pm

Thanks to commenter Gpiton, who on my last post about attributing the pause, alerted me to the commentary by Schmidt et al (2014) in Nature (hereafter SST14).  In the commentary, the authors attempt to see how the properties of the CMIP5 ensemble and mean might change if updated forcings were used (by running a simpler model), and find that the results are closer to the observed temperatures (thus primarily attributing the temperature trend discrepancy to incorrect forcings).  Overall, I don’t think there is anything wrong with the general approach, as I did something very similar in my last post.  However, I do think that some of the assumptions about the cooler forcings in the "correction" are more favorable to the models than others might choose, and a conclusion could easily be misinterpreted.  This is my list of comments that, were I a reviewer, I would submit:

Comment #1: The sentence

We see no indication, however, that transient climate response is systematically overestimated in the CMIP5 climate models as has been speculated, or that decadal variability across the ensemble of models is systematically underestimated, although at least some individual models probably fall short in this respect.

is vague enough to perhaps be technically true while at the same time giving the (incorrect, IMO) impression they have found that models are correctly simulating TCR or decadal variability.  It may be technically true in that they find "no indication" of bias in TCR or internal variability, due to the residual “prediction uncertainty”, but this is one of those "absence of evidence is not evidence of absence" scenarios, where even if the models WERE biased high in their TCR there would be no indication by this definition.  By the description in the commentary, the adjustments only remove about 60-65% of the discrepancy.  The rest of the discrepancy may be related to non-ENSO noise, but it also may be related to TCR bias, and would be what we expect to see if, for example, the "true" TCR was 1.3K (as in Otto et al., 2013) vs. the CMIP5 mean of 1.8K.  Obviously, the reference to Otto et al. (2013) might be mistaken by some to suggest an answer/refutation to that study (which used a longer period to diagnose TCR in order to reduce the "noise"), but clearly this would be wrong.  Had I been a reviewer, I would have suggested changing the wording: "The residual discrepancy may be consistent with an overestimate of the transient climate response in CMIP5 models [Otto et al., 2013] or an underestimate of decadal variability, but it is also consistent with internal noise unrelated to ENSO, and we thus can neither rule out nor confirm any of the explanations in this analysis."  Certainly has a different feel, but it essentially communicates the same information, and is much less likely to be misinterpreted by the reader.

Comment #2: Regarding the overall picture of updated forcings, it is worth pointing out that IPCC AR5 Chapter 9  [Box 9.2, p 770] describes a largely different opinion (ERF = ”effective radiative forcing”)

For the periods 1984–1998 and 1951–2011, the CMIP5 ensemble-mean ERF trend deviates from the AR5 best-estimate ERF trend
by only 0.01 W m–2 per decade (Box 9.2 Figure 1e, f). After 1998, however, some contributions to a decreasing ERF trend are missing
in the CMIP5 models, such as the increasing stratospheric aerosol loading after 2000 and the unusually low solar minimum in 2009.
Nonetheless, over 1998–2011 the CMIP5 ensemble-mean ERF trend is lower than the AR5 best-estimate ERF trend by 0.03 W m–2 per
decade (Box 9.2 Figure 1d). Furthermore, global mean AOD in the CMIP5 models shows little trend over 1998–2012, similar to the
observations (Figure 9.29). Although the forcing uncertainties are substantial, there are no apparent incorrect or missing global mean
forcings in the CMIP5 models over the last 15 years that could explain the model–observations difference during the warming hiatus.

(My emphasis).  Essentially, the authors of this chapter find a discrepancy of 1.5 * 0.03 = 0.045 W/m^2 over the hiatus, whereas SST14 use a discrepancy of around 0.3 W/m^2, which is nearly 7 times larger!  And there do not appear to have been any new revelations about these forcings since the contributions to the report were locked down – the report references the "increasing stratospheric aerosol loading after 2000" and "unusually low solar minimum in 2009" mentioned in the commentary.  Regarding the anthropogenic aerosols, both the Shindell et al. (2013) and Bellouin et al. (2011) papers referenced by SST14 for the nitrate and indirect aerosol estimates are also referenced in that AR5 chapter, and Shindell was a contributing author to the chapter.  This is not to say that the IPCC is necessarily right in this matter, but it does suggest that not everyone agrees with the magnitude of the forcing difference used in SST14.
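The arithmetic behind the "nearly 7 times" figure, written out in Python. The 1.5-decade factor and the two forcing numbers are the ones quoted above; the variable names are mine.

```python
trend_gap = 0.03   # W/m^2 per decade, ERF trend difference over 1998-2011 (AR5 Box 9.2)
decades = 1.5      # length of the hiatus period as used in the text
sst14_gap = 0.3    # W/m^2, approximate forcing discrepancy used by SST14

ar5_gap = trend_gap * decades   # forcing discrepancy implied by AR5
ratio = sst14_gap / ar5_gap     # how much larger the SST14 value is

print(f"AR5-implied discrepancy: {ar5_gap:.3f} W/m^2")
print(f"SST14 / AR5 ratio:       {ratio:.1f}")
```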

Comment #3:  Regarding the solar forcing update, per box 1, SST14 note: "We multiplied the difference in total solar irradiance forcing by an estimated factor of 2, based on a preliminary analysis of solar-only transient simulations, to account for the increased response over a basic energy balance calculation when whole-atmosphere chemistry mechanisms are included."

I would certainly want to see more justification for doubling the solar forcing discrepancy (this choice alone accounts for about 15% of the 1998-2012 forcing discrepancy used)!  If I understand correctly, they are saying that they found that the transient response to the solar forcing is approximately double the response to other forcings in their preliminary analysis.  But this higher sensitivity to a solar forcing would seem to be an interesting result in its own right, and I would want to know more about this analysis, and what simulations were used – was this observed in one, some, most, or all of the CMIP5 models?  After all, if adjusting the CMIP5 model *mean*, it would be important to know that this was a general property shared across most of the CMIP5 models.

Comment #4: For anthropogenic tropospheric aerosols, two adjustments are made.  One for the nitrate aerosol forcing, and the second for the aerosol indirect effect.  SST14 notes that only two models include the nitrate aerosol forcing, whereas half contain the aerosol indirect effect, and so the ensemble and mean are adjusted for these.  But it is not clear to me if the individual runs for each of the CMIP5 models are adjusted (and thereby no adjustments are made to runs from models that include the effect), and the mean recalculated, or if simply the mean is adjusted.  The line "…if the ensemble mean were adjusted using results from a simple impulse-response model with our updated information on external drivers" makes me think the latter.  But if it is this latter case, clearly this is incorrect – if half of the models already include the effect, you would be overcorrecting by a factor of 2 if you adjusted the mean by this full amount (perhaps the -0.06 is halved before the adjustment is actually made, but it is not specified).
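A toy Python illustration of the factor-of-2 point, assuming a four-model ensemble in which exactly half the models already include the effect. All numbers are hypothetical, and the -0.06 is treated here simply as a shift applied to whatever quantity is being adjusted.

```python
def mean_with_per_model_adjustment(values, has_effect, adjustment):
    """Adjust only the models that lack the effect, then take the ensemble mean."""
    adjusted = [v if included else v + adjustment
                for v, included in zip(values, has_effect)]
    return sum(adjusted) / len(adjusted)


def mean_with_blanket_adjustment(values, adjustment):
    """Apply the full adjustment directly to the ensemble mean."""
    return sum(values) / len(values) + adjustment


# Hypothetical ensemble: four models, two of which already include the effect.
values = [0.20, 0.20, 0.20, 0.20]
has_effect = [True, False, True, False]
adjustment = -0.06

original_mean = sum(values) / len(values)
per_model_shift = mean_with_per_model_adjustment(values, has_effect, adjustment) - original_mean
blanket_shift = mean_with_blanket_adjustment(values, adjustment) - original_mean

print(f"per-model shift: {per_model_shift:.3f}")
print(f"blanket shift:   {blanket_shift:.3f}")
```

With half of the models exempt, the correctly adjusted mean moves by half as much as the blanket-adjusted mean.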

Comment #5: Regarding the indirect aerosol forcing, Bellouin et al. (2011) is used as the reference, which uses the HadGEM2-ES model.  It is worth noting the caveats:

The first indirect forcing in HadGEM2‐ES, which can be diagnosed to the first order as the difference between total and direct forcing [Jones et al., 2001] might
overestimate that effect by considering aerosols as externally mixed, whereas aerosols are internally mixed to some extent.  By comparing with satellite retrievals, Quaas et al. [2009] suggest that HadGEM2 is among the climate models that overestimate the increase in cloud droplet number concentration  with aerosol optical depth and would therefore simulate too strong a first indirect effect.

Moreover, I am not quite sure of the origin of the -0.06 W/m^2.  Bellouin et al. (2011) suggest an indirect effect from nitrates that is ~40% the strength of the direct effect.  So if only nitrate aerosols increased over the post-2000 period, I would expect an indirect effect of ~ -0.01 W/m^2.  It seems to me that this must include the much-larger effect of sulfate aerosols, which leads me to my next comment…

Comment #6: Sulfate Aerosols.  Currently, sulfate aerosols constitute a much larger portion of the aerosol forcing than do other species (nitrates in particular).  I presume that for #5, the indirect aerosol forcing of that magnitude would need to result from an increase in sulfur dioxide emissions.  But as per the reference of Klimont et al. (2013) in my last post, global sulfur dioxide emissions have been on the decline since 1990, and since 2005 (when the CMIP5 RCP forcings start) the Chinese emissions have been on the decline as well (only India continues to increase).  Rather than the lack of indirect forcing artificially warming the models relative to observations, it seems like it has been creating a *cooling* bias over this period, if you use the simple relationship between emissions and forcing as in Smith and Bond (2014).  In fact, since 2005 (according to Klimont et al., 2013 again), sulfur dioxide emissions have declined faster than in 3 of the 4 RCP scenarios.  It seems likely to me that the decline in sulfur dioxide emissions over this period (and its corresponding indirect effect) would more than counteract the tiny bias from the NO2 emissions.


Having just done a similar analysis, I thought it important to put the Schmidt et al. (2014) Nature commentary in context.  There is enough uncertainty around the actual forcing progression during the "hiatus" to find a set of values that attribute most of the discrepancy between CMIP5 modeled and observed temperatures to forcing differences.  However, the values chosen by SST14 do seem to represent the high end of this forcing discrepancy, and it appears that most of the authors of AR5 chapter 9 believe the forcing discrepancy to be much more muted.  Moreover, the SST14 commentary should not be taken to be a response to longer period, more direct estimates of TCR, such as that of Otto et al. (2013).  Specifically, the TCR bias found in that study would be perfectly consistent with the remaining discrepancy and uncertainty present between the CMIP5 models and observations.

February 21, 2014

Breaking down the discrepancy between modeled and observed temperatures during the “hiatus”

Filed under: Uncategorized — troyca @ 9:35 pm



There are many factors that have been proposed to explain the discrepancy between observed surface air temperatures and model projections during the hiatus/pause/slowdown. One slightly humorous result is that many of the explanations used by the “Anything but CO2” (ABC) group – which argues that previous warming is caused by anything (such as solar activity, the PDO, or errors in observed temperatures) besides CO2 – are now used by the “Anything but Sensitivity” (ABS) group, which seems to argue that the difference between modeled and actual temperatures may be due to anything besides oversensitivity in CMIP5 models. And while many of these explanations likely have merit, I have not yet seen somebody try to quantify all of the various contributions together. In this post I attempt (perhaps too ambitiously) to quantify likely contributions from coverage bias in observed temperatures, El Nino Southern Oscillation (ENSO), post-2005 forcing discrepancies (volcanic, solar, anthropogenic aerosols and CO2), the Pacific Decadal Oscillation (PDO), and finally the implications for transient climate sensitivity.

Since the start of the “hiatus” is not well defined, I will consider 4 different start years, all ending in 2013. 1998 is often used because of the large El Nino that year, which minimizes the magnitude of the trend starting in that year.  On the other hand, the start of the 21st century is sometimes considered as well. Moreover, I will use HadCRUTv4 as the temperature record (since this is the one more often cited to represent the hiatus), which will show a larger discrepancy at the beginning than GISS, but will also show a larger influence from the coverage bias. The general approach here is to ask: IF the CMIP5 multi-model mean (for RCP4.5) is unbiased, what percentage of the discrepancy can we attribute to the various factors? Only at the end do we look into how the model sensitivity may need to be “adjusted”. Note that the steps below are cumulative, each building off of the previous adjustments. Given that, here is the discrepancy we start with:

[Table columns: Start year, HadCRUT4 (K/Century), RCP4.5 MMM (K/Century)]

Code and Data

My script and data for this post can all be downloaded in the zip package here. Regarding the source of all data:

· Source of “raw” temperature data is HadCRUTv4

· Coverage-bias adjusted temperature data is from Cowtan and Way (2013) hybrid with UAH

· CMIP5 multi-model mean for RCP4.5 comes from Climate Explorer

· Multivariate ENSO index (MEI) comes from NOAA by way of Climate Explorer

· Total Solar Irradiance (TSI) reconstruction comes from SORCE

· Stratospheric Aerosol Optical Thickness comes from Sato et al., (1993) by way of GISS

· CMIP5 multi-model mean for natural only comes from my previous survey

· PDO index comes from the JISAO at the University of Washington by way of Climate Explorer.


Step 1: Coverage Bias

For the first step, we represent the contribution from coverage bias using the results from Cowtan and Way (2013). This is one of two options, with the other being to mask the output from models and compare it to HadCRUT4. The drawback of using CW13 is that we are potentially introducing spurious warming by extrapolating temperatures over the Arctic. The drawback of masking, however, is that if indeed the Arctic is warming faster in reality than it is in the multi-model-mean, then we are missing that contribution. Ultimately, I chose to use CW13 in this post because it is a bit easier, and because it likely represents an upper bound on the “coverage bias” contribution. I may examine the implications of using masked output in a future post.


The above graph is baselined over 1979-1997 (prior to the start of the hiatus), which highlights the discrepancy that occurs during the hiatus.

[Table columns: Start year, S1: HadCRUT4 Coverage Adj (CW13, K/Century), RCP4.5 MMM (K/Century)]

Step 2: ENSO Adjustment

The ENSO adjustment here is simply done using multiple linear regressions, similar to Lean and Rind (2008) or Foster and Rahmstorf (2011), except using the exponential decay fit for other forcings, as described here.  While I have noted several problems with the LR08 and FR11 approach with respect to solar and volcanic attribution, which I mention in the next step, I also found that ENSO variations are high enough frequency so as to be generally unaffected by other limitations in the structural fit of the regression model.



Step 3: Volcanic and Solar Forcing Updates for Multi-Model-Mean

The next step in this process is a bit more challenging. We want to see to what degree updated solar and volcanic forcings would have decreased the multi-model mean trend over the hiatus period, but it is quite a task to have all the groups re-run their CMIP5 models with these updated forcings. Moreover, as I mentioned above and in previous posts (and my discussion paper), simply using linear regressions does not adequately capture the influence of solar and volcanic forcings. Instead, here I use a two-layer model (from Geoffroy et al., 2013) to serve as an emulator for the multi-model mean, fitting it to the mean of those natural-only forcing runs over the period. This is a sample of the “best fit”, which seems to adequately capture the fast response at least, even if it may be unable to capture the response over longer periods (but we only care about the updates from 2005-2013):



And here are the updates to the solar and volcanic forcings (updates in red). For the volcanic forcing, we have CMIP5 volcanic aerosols returning to background levels after 2005. For the solar forcing, we have CMIP5 using a naïve, recurring 11-year solar cycle, as shown here, after 2005.


The multi-model mean is then “adjusted” by the difference between our emulated volcanic and solar temperatures from the CMIP5 forcings and the observed forcings. The result is seen below:


Over the hiatus period, the effect of the updated solar and volcanic forcings reduces the multi-model mean trend by between 13% and 20%, depending on the start year.
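For anyone unfamiliar with the two-layer emulator used in this step, here is a minimal Python sketch of that kind of model, integrated with a simple forward-Euler step. The parameter values and forcing series are illustrative placeholders, not the fitted values or forcings from this post, and the deep-ocean efficacy term is omitted (assumed equal to 1).

```python
def two_layer_response(forcing, lam=1.13, gamma=0.7, C=8.0, C0=100.0, dt=1.0):
    """Upper-layer temperature anomaly (K) from a two-layer energy balance model.

    forcing : sequence of forcings (W/m^2), one per time step of length dt (years)
    lam     : feedback parameter (W/m^2/K)
    gamma   : heat exchange coefficient between the layers (W/m^2/K)
    C, C0   : heat capacities of the upper and deep layers (W yr/m^2/K)
    All default parameter values are placeholders for illustration.
    """
    T, T0 = 0.0, 0.0
    temps = []
    for F in forcing:
        dT = (F - lam * T - gamma * (T - T0)) / C   # upper layer tendency
        dT0 = gamma * (T - T0) / C0                 # deep layer tendency
        T += dt * dT
        T0 += dt * dT0
        temps.append(T)
    return temps


# Hypothetical forcing series for 2005-2013 (nine annual values).
cmip5_forcing = [0.0] * 9
updated_forcing = [-0.05] * 9   # placeholder for "observed" solar + volcanic update

# Adjustment to apply to the multi-model mean: emulated response to the updated
# forcing minus the emulated response to the forcing the models actually used.
adjustment = [u - c for u, c in zip(two_layer_response(updated_forcing),
                                    two_layer_response(cmip5_forcing))]
print([round(a, 4) for a in adjustment])
```

Because the model is linear, the adjustment depends only on the difference between the two forcing series, which is why only the post-2005 updates matter here.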


Updated anthropogenic forcings?

With regards to the question of how updated greenhouse gas and aerosols forcings may have contributed to the discrepancy over the hiatus period, it is not easy to get an exact number, but based on evidence of concentrations and emissions that I’ve seen, there does not seem to be a significant deviation between the RCP4.5 scenario from 2005-2013 and what we’ve observed. This is unsurprising, as the projected trajectories for all of the RCP scenarios (2.6, 4.5, 6.0, 8.5) don’t substantially deviate until after this period.

For instance, the RCP4.5 scenario assumes the CO2 concentration goes from 376.8 ppm in 2004 to 395.6 ppm in 2013. Meanwhile, the measured annual CO2 concentration has gone from 377.5 ppm in 2004 to 396.5 ppm in 2013. By my back-of-the-envelope calculation, this means we have actually experienced an increase in forcing of 0.002 W/m^2 more than in the RCP4.5 scenario, which is at least an order of magnitude away from being relevant here.
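The back-of-the-envelope number can be reproduced in Python with the commonly used simplified expression for CO2 forcing, ΔF = 5.35 ln(C/C0) W/m^2. The post does not state which formula was used, so treat that expression as my assumption; the concentrations are the ones quoted above.

```python
import math


def co2_forcing_change(c_end, c_start, alpha=5.35):
    """Forcing change (W/m^2) for a CO2 change from c_start to c_end (ppm).

    Uses the simplified logarithmic expression with alpha = 5.35 W/m^2
    (an assumed formula; the post does not specify one).
    """
    return alpha * math.log(c_end / c_start)


rcp45 = co2_forcing_change(395.6, 376.8)     # RCP4.5, 2004 to 2013
observed = co2_forcing_change(396.5, 377.5)  # measured, 2004 to 2013
extra = observed - rcp45

print(f"RCP4.5:   {rcp45:.4f} W/m^2")
print(f"Observed: {observed:.4f} W/m^2")
print(f"Extra:    {extra:.4f} W/m^2")
```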

For aerosols, Murphy (2013) suggests little change in forcing from 2000-2012, the bulk of the hiatus period examined. Klimont et al (2013) find a reduction in global (and Chinese) sulfur dioxide emissions since 2005, compared to the steady emission used in RCP4.5 from 2005-2013, meaning that updating this forcing would actually increase the discrepancy between the MMM and observed temperatures. However, it seems safer to simply assume that mismatches between projected and actual greenhouse gas and aerosol emissions have contributed a likely maximum of 0% to the observed discrepancy over the hiatus, and it is quite possible that they have contributed a negative amount (that is, using the observed forcing would increase the discrepancy).


Step 4 & 5: PDO Influence and TCR Adjustment

Trying to tease out the “natural variability” influence on the hiatus is quite challenging. However, most work seems to point to the variability in the Pacific: Trenberth and Fasullo (2013) suggest the switch to the negative phase of the PDO is responsible, causing changing surface wind patterns and sequestering more heat in the deep ocean. Matthew England presents a similar argument, tying in his recent study to that of Kosaka and Xie (2013) over at Real Climate.

In general, the idea is that the phase of the PDO affects the rate of surface warming. If we assume that the PDO index properly captures the state of the PDO, and that the rate of warming is proportional to the PDO index (after some lag), we should be able to integrate the PDO index to capture the form of the influence on global temperatures. Unfortunately, because of the low frequency of this oscillation, significant aliasing may occur between the PDO and anthropogenic component if we regress this form directly against observed temperatures.

There are thus two approaches I took here. First, we can regress the remaining difference between the MMM adjusted for updated forcings and the ENSO-adjusted CW13, which should indicate how much of this residual discrepancy can be explained by the PDO. In this test, the result of the regression was insignificant – the coefficient was in the “wrong” direction (implying that the negative phase produced warming), and R^2=0.04. This is because, as Trenberth et al. (2013) note, the positive phase was in full force from 1975-1998, contributing to surface warming. But the MMM matches too well the rate of observed surface warming from 1979-1998, leaving no room for the natural contribution from the PDO.

To me, it seems that if you are going to leave room for the PDO to explain a portion of the recent hiatus, it means that models probably overestimated the anthropogenic component of the warming during that previous positive phase of the PDO. Thus, for my second approach, I again use the ENSO-adjusted CW13 as my dependent variable in the regression, but in addition to using the integrated PDOI as one explanatory variable, I include the adjusted MMM temperatures as a second variable. This will thus find the best “scaling” of the MMM temperature along with the coefficient for the PDO.
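The second approach can be sketched in Python as follows: integrate the PDO index with a cumulative sum, then regress the temperatures on both the integrated index and the MMM. The series below are synthetic and built with known coefficients purely to show that the regression recovers them; they are not the data from this post, and the helper names are mine.

```python
from itertools import accumulate


def integrate_index(index):
    """Cumulative sum of an index, a discrete stand-in for its time integral."""
    return list(accumulate(index))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ols(y, *regressors):
    """Least-squares fit y = b0 + b1*x1 + b2*x2 + ...; returns [b0, b1, b2, ...]."""
    X = [[1.0] + [r[t] for r in regressors] for t in range(len(y))]
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yt for row, yt in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Synthetic series, 1979-2013 (hypothetical, for illustration only).
years = list(range(1979, 2014))
mmm = [0.02 * (y - 1979) for y in years]           # stand-in for adjusted MMM temps
pdo = [1.0 if y < 1999 else -1.0 for y in years]   # stand-in PDO index
pdo_integrated = integrate_index(pdo)
obs = [0.05 + 0.73 * m + 0.005 * p for m, p in zip(mmm, pdo_integrated)]

intercept, mmm_scale, pdo_coef = ols(obs, mmm, pdo_integrated)
print(f"MMM scaling: {mmm_scale:.3f}, PDO coefficient: {pdo_coef:.4f}")
```

With real data the two explanatory variables are far less cleanly separable than in this toy case, which is the aliasing problem mentioned above.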

After using this method, we indeed find the “correct” direction for the influence of the PDO:


According to this regression, the warm phase of the PDO contributed about 0.1 K to the warming from 1979-2000, or about 1/3 of the warming over that period. Since shifting to the cool phase at the turn of the 21st century, it has contributed about 0.04 K cooling to the “hiatus”. This suggests a somewhat smaller influence than England et al. (2014) find.

For the MMM coefficient, we get a value of 0.73. This would imply that the transient climate sensitivity is biased 37% too high in the multi-model mean. Since the average transient climate sensitivity for CMIP5 is 1.8 K, this coefficient suggests that the TCR should be “adjusted” to 1.3 K. This value corresponds to those found in other observationally-based estimates, most notably Otto et al. (2013).
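The arithmetic behind these two numbers can be checked with a short Python sketch. It only uses the 0.73 coefficient and the 1.8 K CMIP5 mean quoted above; the variable names are illustrative.

```python
mmm_coefficient = 0.73   # regression scaling of the multi-model mean
cmip5_mean_tcr = 1.8     # K, average CMIP5 transient response

# Scaling the model-mean TCR by the regression coefficient.
adjusted_tcr = mmm_coefficient * cmip5_mean_tcr

# How much too high the unscaled models are, relative to the scaled value.
bias_percent = (1.0 / mmm_coefficient - 1.0) * 100.0
```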

When we put everything together, and perform the “TCR Adjustment” to the CMIP5 multi-model-mean as well, we get the following result:




Using the above methodology, the table below shows the estimated contribution by each factor to the modeled vs. observational temperature discrepancy during the hiatus (note that these rows don’t necessarily add up to 100% since the end result is not a perfect match):

Start Year | Step1: Coverage Bias | Step2: ENSO | Step3: Volc+Solar Forcings | Step4: PDO (surface winds & ocean uptake) | Step5: TCR Bias

According to this method, the coverage bias is responsible for the greatest discrepancy over this period. This is likely contingent upon using Cowtan and Way (2013) rather than simply masking the CMIP5 MMM output (and using HadCRUT4 rather than GISS). Moreover, 65% – 79% of the temperature discrepancy between models and observations during the hiatus may be attributed to something other than a bias in model sensitivity. Nonetheless, this residual warm bias in the multi-model mean does seem to exist, such that the new best estimate for TCR should be closer to 1.3K.

Obviously, there are a number of uncertainties regarding this analysis, and many of these uncertainties may be compounded at each step. Regardless, it seems pretty clear that while the hiatus does not mean that surface warming from greenhouse gases has ceased – given the other factors that may be counteracting such warming in the observed surface temperatures – there is still likely some warm bias in the CMIP5 modeled TCR contributing to the discrepancy.

October 17, 2013

How well do the IPCC’s statements about the 2°C target for RCP4.5 and RCP6.0 scenarios reflect the evidence?

Filed under: Uncategorized — troyca @ 7:57 pm

1. Introduction

In the IPCC AR5 summary for policy makers (SPM), there are few statements that are likely to garner more attention than those related to projected warming for this century under various scenarios.  In particular, given the prominence placed on the 2 degrees Celsius target, I would argue that Section E.1 is of great importance for policy makers. In the top box, we read:

Global surface temperature change for the end of the 21st century is likely to exceed 1.5°C relative to 1850 to 1900 for all RCP scenarios except RCP2.6. It is likely to exceed 2°C for RCP6.0 and RCP8.5, and more likely than not to exceed 2°C for RCP4.5.

(My bold).  Note that RCP4.5 involves a continual increase of global CO2 emissions up until ~2040, whereas RCP6.0 shows a large increase in emissions until ~2060 (see below, from Figure 6 of van Vuuren et al., 2011).  It is obviously of interest to know the likelihood of staying under the target of 2°C above pre-industrial this century without reducing global emissions (note that reducing emissions is not the same as reducing the rate of emissions increase) for another 30 – 50 years.


Figure 6, van Vuuren et al., 2011

The statement is repeated in a bullet point below E.1, along with the reference to where we can find more information in the heart of the report:

Relative to the average from year 1850 to 1900, global surface temperature change by the
end of the 21st century is projected to likely exceed 1.5°C for RCP4.5, RCP6.0 and RCP8.5
(high confidence). Warming is likely to exceed 2°C for RCP6.0 and RCP8.5 (high confidence),
more likely than not to exceed 2°C for RCP4.5 (high confidence), but unlikely to exceed 2°C
for RCP2.6 (medium confidence). Warming is unlikely to exceed 4°C for RCP2.6, RCP4.5 and
RCP6.0 (high confidence) and is about as likely as not to exceed 4°C for RCP8.5 (medium
confidence). {12.4}

My Bold.  Based on my reading,  I think the current state of evidence makes it difficult to agree with the qualitative expressions of probability given for RCP4.5 ("more likely than not") and RCP6.0 ("likely") regarding the 2 degrees target, as well as the "high confidence" given, which I will explain in more depth below.

My interest in this was sparked recently when using a two-layer model with prescribed TCR and effective sensitivities to trace the warming up to the year 2100.  Somewhat surprisingly, I found that for many realistic TCR scenarios, the simulated earth warmed less than 2°C  by the end of the century.  I began to do a scan of the recently released AR5 to find the justification for the scenarios mentioned above.

2. Probabilistic Statements and Confidence

First, let’s discuss what the SPM means by "more likely than not to exceed 2°C for RCP4.5 (high confidence)".  Based on my reading of the Uncertainty Guidance, I believe this should mean there is more than a 50% chance of reaching 2°C by the end of the century ("more likely than not"), and that there is plenty of evidence that is in widespread agreement ("high confidence") about this probability.  Regarding the statement about RCP6.0, "Warming is likely to exceed 2°C for RCP6.0 and RCP8.5 (high confidence)", the "likely" refers to a greater than 66% probability.
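For reference, the likelihood terms used in this post can be written as a small lookup, sketched here in Python. The thresholds are the lower probability bounds discussed in the text (50%, 66%, and the 90% associated with "very likely"); the function name `strongest_term` is made up for illustration.

```python
# Lower probability bounds (in %) for the likelihood terms used in this post.
LIKELIHOOD_FLOOR = {
    "very likely": 90,
    "likely": 66,
    "more likely than not": 50,
}

def strongest_term(probability_percent):
    """Return the strongest listed term whose floor the probability exceeds, else None."""
    for term, floor in sorted(LIKELIHOOD_FLOOR.items(), key=lambda kv: -kv[1]):
        if probability_percent > floor:
            return term
    return None
```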

Anyhow, since SPM points us to Chapter 12 (and section 12.4 in particular) that’s where I’ll start.  From section

The percentage calculations for the long-term projections in Table 12.3 are based solely on the CMIP5 ensemble, using one ensemble member for each model. For these long-term projections, the 5–95% ranges of the CMIP5 model ensemble are considered the likely range, an assessment based on the fact that the 5–95% range of CMIP5 models’ TCR coincides with the assessed likely range of the TCR (see Section below and Box 12.2). Based on this assessment, global mean temperatures averaged in the period 2081–2100 are projected to likely exceed 1.5°C above preindustrial for RCP4.5, RCP6.0 and RCP8.5 (high confidence). They are also likely to exceed 2°C above preindustrial for RCP6.0 and RCP8.5 (high confidence).

This seems to suggest that the long-term projections (that is, the warming expected by the end of the century) are based solely on the CMIP5 model runs.  The observational assessments of TCR (transient climate response) only come into play insofar as that “likely” range approximately matches the 5%-95% range of CMIP5 models.  Notice that the statement of greater than 2°C being “more likely than not” for RCP4.5 is absent here, despite being present in the relevant portion of the SPM.  So how does that statement find justification in the SPM, and how does it have "high confidence"?

3. On the “More Likely Than Not / High Confidence” RCP4.5 Statement

My impression is that this statement arises based on table 12.3, where 79% of the models produce a warming of more than 2°C under the RCP4.5 scenario.  The high confidence presumably comes from the agreement between the assessed likely  range of TCR estimates and the 5-95% range of TCR in models, but the problem with this becomes obvious when looking at box 12.2, figure 2:


Box 12.2, Figure 2

Note that while the assessed grey ranges roughly match, the actual distributions are largely different.  The CMIP5 models have a mode for TCR > 2 (with a mean of 1.8, per chapter 9), while most of the AR5 estimates show a mode to the left of it.  In other words, just because there is "high confidence" in the 5% and 95% boundaries for CMIP5 projections, this does NOT give a legitimate basis for translating it into confidence about more specific aspects of the CMIP5 projected temperature rise distributions, particularly the "most likely" values (implied by the "more likely than not" statement).  Moreover, our best current evidence suggests the average of CMIP5 models is running too hot (as seen below), so one must be especially careful about making such specific statements based on AOGCM results.

In section, the report alludes to the higher CMIP5 transient response issue:

A few recent studies indicate that some of the models with the strongest transient climate response might overestimate the near term warming (Otto et al., 2013; Stott et al., 2013) (see Sections 10.8.1,, but there is little evidence of whether and how much that affects the long term warming response.

This last statement is quite curious.  After all, the report claimed above that the rough matching of the range of TCR estimates with the 5-95% range of CMIP5 TCRs increased confidence in CMIP5 projections of long-term warming, but here the discrepancy in TCRs between estimates and models is dismissed due to lack of evidence of how it affects long-term warming? This seems hard to reconcile with Box 12.2, which notes:

For scenarios of increasing radiative forcing, TCR is a more informative indicator of future climate than ECS
(Frame et al., 2005; Held et al., 2010).

Indeed, this relative importance of TCR for end of century warming is one of the things I talked about in my last post.  Further investigation using that 2-layer model (script here) produces the following chart of TCR vs. 2081-2100 temperature above pre-industrial:


While the model is simplified and only uses one mean forcing series (from Forster et al., 2013), it indicates that a TCR less than 1.6 K likely indicates less than 2 degrees of warming by the end of the century for the RCP4.5 scenario.  Again, going back to table 12.3, we see that 79% of the models produce a warming of more than 2°C under the RCP4.5 scenario.  However, by my count, only 8 of the 31 models (26%) have a TCR of less than 1.6K, so this matches up decently with expectations based on TCR (although the picture isn’t quite that pretty, as 2 have TCRs of 1.6K).  Nonetheless, I would suggest that if the true TCR is less than 1.6K, the RCP4.5 scenario is unlikely to produce a warming of more than 2°C by the end of this century (relative to pre-industrial). 

Thus, our question comes down to this – given the best possible evidence, what is the probability that TCR is < 1.6K?  I would suggest it is "more likely than not".  First of all, it appears that the bulk of the AR5 estimates that include a pdf show a most likely value <= 1.6K.  This is despite the fact that (I believe) only Otto et al. include the lesser impact of aerosols assessed in AR5 in their estimate, which would further reduce the estimated likely values for TCR.  Second, it is generally accepted now that the multi-model mean, with its 1.8K TCR, is running on the warm side.  Either way, given the discrepancy between most-likely values in the various estimates, I would downgrade the confidence.

My rewrite of the SPM for this part: "Relative to the average from year 1850 to 1900, global surface temperature change by the end of the 21st century will more likely than not stay below 2°C for RCP4.5 (medium confidence)."    

4. On the “Likely” / “High Confidence” RCP6.0 Statement

Things start out a bit confusing for the "likely"  greater than  2°C statement for RCP6.0, per the following comment under the chapter 12 executive summary notes:

Under the assumptions of the concentration-driven RCPs, global-mean surface temperatures for 2081–2100, relative to 1986–2005 will likely be in the 5–95% range of the CMIP5 models

(My italics)  I say that this is somewhat confusing because a 5%-95% range – according to the chapter – is associated with "very likely", and not simply "likely".  However, we must distinguish between the range of model outcomes and that of real world possibilities, which the authors appear to do as well.  Given that the assessed "likely" range of TCR is approximately as wide as the 5%-95% TCR range of the CMIP5 models, it is clear that some sort of probabilistic downgrade is required (that is, using only 1 standard deviation of CMIP5 model TCR does not properly capture the whole "likely" range of real-world TCRs).  So the authors assess that the "very likely" range of CMIP5 models is only expected to capture the "likely" range of real-world possibilities (Sect again):

The likely ranges for 2046–2065 do not take into account the possible influence of factors that lead to near-term (2016–2035) projections of GMST that are somewhat cooler than the 5–95% model ranges (see Section 11.3.6), because the influence of these factors on longer term projections cannot be quantified. A few recent studies indicate that some of the models with the strongest transient climate response might overestimate the near term warming (Otto et al., 2013; Stott et al., 2013) (see Sections 10.8.1,, but there is little evidence of whether and how much that affects the long term warming response. One perturbed physics ensemble combined with observations indicates warming that exceeds the AR4 at the top end but used a relatively short time period of warming (50 years) to constrain the models’ projections (Rowlands et al., 2012) (see Sections and Global-mean surface temperatures for 2081–2100 (relative to 1986–2005) for the CO2 concentration driven RCPs is therefore assessed to likely fall in the range 0.3°C–1.7°C (RCP2.6), 1.1°C–2.6°C (RCP4.5), 1.4°C–3.1°C (RCP6.0), and 2.6°C–4.8°C (RCP8.5) estimated from CMIP5

My bold.   (Note that the chapter indicates 0.6°C as the difference between pre-industrial and 1986-2005, so the RCP6.0 range is 2.0°C–3.7°C above pre-industrial, as confirmed in table 12.3). 
So, what are the problems here?

First, this seems awfully casual about the divergence between model projections and observed temperatures.  I understand that it may be unclear how this relates to long-term projections (although several recent observational studies find a lower ECS than most CMIP5 model ECSs as well), but this should not then translate into "high confidence" in the CMIP5 projections, particularly the lower end of that range.

Second, the assessed "likely" TCR range includes TCR values that would probably keep the RCP6.0 scenario below 2.0°C by the end of the century.  Note that the floor of the RCP6.0 range for CMIP5 is exactly 2.0°C above pre-industrial, which is presumably why the executive summary was able to say warming will "likely" be above 2.0°C for that scenario, but just barely.  Again, the confidence is "based on the fact that the 5–95%  range of CMIP5 models’ TCR coincides with the assessed likely range of the TCR".  But the ranges don’t match exactly, per Box 12.2 again:     

This assessment concludes with high confidence that the transient climate response (TCR) is likely in the range 1°C–2.5°C, close to the estimated 5–95% range of CMIP5 (1.2°C–2.4°C, see Table 9.5).

So the lower end of the "likely" range of TCR is 1.0°C (all evidence) rather than 1.2°C (CMIP5 only).  This would be a rather trivial difference, except that the lower floor for projected "likely" range using CMIP5 for RCP6.0 is exactly 2.0°C, so that lowering this 5% range to reflect all evidence – even if only from 1.2°C to 1.0°C – means probably lowering that projected warming above pre-industrial floor to around 1.7°C (1.0/1.2 * 2.0).  In other words, the "likely" range actually includes values below 2.0°C, and it would be difficult to say that that rise for RCP6.0 is "likely" to be above 2.0°C.***
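The proportional scaling used for that 1.7°C figure can be sketched in Python as follows. It assumes, as the text does, that the projected floor scales linearly with the lower TCR bound; the function name `scale_floor` is made up for illustration.

```python
def scale_floor(cmip5_floor_warming, cmip5_floor_tcr, assessed_floor_tcr):
    """Scale a projected warming floor linearly with the lower TCR bound."""
    return cmip5_floor_warming * assessed_floor_tcr / cmip5_floor_tcr

# RCP6.0: 2.0 C floor from CMIP5 (TCR floor 1.2 K), rescaled to the assessed 1.0 K floor.
rcp60_floor = scale_floor(2.0, 1.2, 1.0)
```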

Moreover, as you can see from my simple two-layer model tests below, a TCR <= 1.4K would mean we would probably see less than 2.0°C in this scenario:


Obviously, there is not much to suggest that the possibility of a TCR <= 1.4K is "unlikely", particularly when examining Box 12.2, Figure 2.  What’s more, the Otto et al. estimates, which are (I think) the only ones listed to use the AR5 aerosol estimates, indicate the "most likely" value is in this range!  So we are left in the rather awkward position that one of the more high-profile studies, using the most up-to-date data, suggests a "most likely" value for TCR that implies less than 2.0°C warming by the end of the century for RCP6.0, but the executive summary says that there is "high confidence" that greater than 2.0°C warming is "likely".

Third and finally, note that the "likely" ECS range in AR5 includes 1.5 K (Box 12.2, fig 1).  Now consider that the "effective forcing" for the RCP6.0 scenario above pre-industrial is 4.8 W/m^2.  This means that at an ECS of 1.5K, you would only have 1.9°C of warming (4.8/3.7 * 1.5) at equilibrium (which can take several centuries to reach), regardless of the TCR.  Given the time delay to reach that equilibrium, it is probable that any ECS below 2K is unlikely to produce more than 2°C of warming by the end of the century in the RCP6.0 scenario.  Thus, the "likely" statement for more than 2°C in RCP6.0 is again questionable, even if based solely on the AR5 likely range of ECS estimates.
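The equilibrium calculation in that paragraph is a one-line scaling, sketched here in Python. The 3.7 W/m^2 forcing per CO2 doubling is the value implied by the 4.8/3.7 ratio in the text; the function name is illustrative.

```python
F2X = 3.7  # W/m^2 per CO2 doubling, as implied by the 4.8/3.7 ratio above

def equilibrium_warming(forcing, ecs, f2x=F2X):
    """Equilibrium warming (K) for a given forcing (W/m^2) and sensitivity (K per doubling)."""
    return forcing / f2x * ecs

# RCP6.0 effective forcing of 4.8 W/m^2 with an ECS of 1.5 K.
rcp60_eq = equilibrium_warming(4.8, 1.5)
```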

One more thing I want to address is the fact that in table 12.3, 100% of the RCP6.0 model runs produce a temperature rise greater than 2°C, despite 5 of the CMIP5 models having a TCR <= 1.4K.  From what I can tell, FGOALS-g2 (1.4K TCR) and INM-CM4 (1.3K TCR) didn’t participate in the RCP6.0 runs.  For the other three models (GFDL-ESM2G, GFDL-ESM2M, and NorESM1-M), I would have expected (based on their TCRs alone) to see less than 2°C of warming, and can only offer a few possible explanations: a) The ECS values of 2.4, 2.4, and 2.8 for these models are high relative to what one might expect for their low TCR values (failing to touch the lower end of the assessed "likely" range for ECS), and despite the greater importance of TCR, the higher ECS in this case might have pushed them just over the 2C mark, and/or b) these models may have produced an "effective forcing" for the RCP6.0 scenario greater than the 4.8 W/m^2 from the model ensemble.

My rewrite of the SPM for this part: "Relative to the average from year 1850 to 1900, global surface temperature change by the end of the 21st century is as likely as not to exceed 2°C for RCP6.0 (low confidence)."

***Note I say "difficult" rather than "impossible", because if the "likely" range includes the middle 67% (approximately +/- 1 standard deviation), then excluding a low value from this likely range actually suggests about an 84% probability of it being greater than this value, not simply the 67% required to get to "likely".  However, more discussion / justification would certainly be required about performing a one-tailed test to deem a value "unlikely".
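The one-tailed figure in this footnote can be checked with a short Python sketch, assuming a symmetric range (and, for the second number, a normal distribution). It uses only the standard library.

```python
from statistics import NormalDist

central = 0.67  # probability mass inside the "likely" range

# With symmetric tails, everything except the lower tail lies above the lower bound.
p_above_lower = central + (1.0 - central) / 2.0

# Same idea for a normal distribution and a lower bound at -1 standard deviation.
p_above_minus_1sd = 1.0 - NormalDist().cdf(-1.0)
```

Both values land near 84%, which is where the footnote's figure comes from.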

5. Discussion

While I disagree with these particular statements in the SPM regarding the current best evidence, I find it hard to fault the authors, as I think much of the problem results from the IPCC process.  Essentially, with the projections based almost entirely on the CMIP5 models, and the IPCC unable to present any “novel” science that doesn’t appear published elsewhere (and hence produce new projections), it is hard to incorporate various other estimates of TCR and ECS into these projections.  Moreover, one is forced to consider most (or all) studies regardless of quality and even if using outdated data.  In fact, I think the authors made a wise decision to avoid using the 5%-95% range from CMIP5 as “very likely”, instead downgrading it as “likely” to reflect the spread of estimated TCRs and ECSs.  Unfortunately, this still has implications on the edges (as with the RCP 6.0 lower boundary of 2°C instead of 1.7°C) and center (trying to figure out a “most likely” value for RCP4.5).  Moreover, we have a rather awkward situation where “likely” values of TCR and ECS imply less than 2°C warming for RCP6.0, but the SPM says that there is “high confidence” in a “likely” diagnosis for more than 2°C warming under RCP6.0.

Overall, I do not envy the job of the IPCC authors, and tend to agree that producing these massive reports for free is probably not the best use of anyone’s time.  It might be better to just create a “living” document such as a wiki, as others have suggested, although I can only imagine the struggles one would come up with in determining the rules for that.

6. Summary

In my opinion, the IPCC Summary for Policymakers overstates both the probability of, and the confidence in, exceeding the 2°C target by the end of the century for the RCP4.5 and RCP6.0 scenarios.  This is because:

  • The statements about the probability were based primarily upon CMIP5 projections
  • Confidence in these long-term projections was primarily justified by the 5%-95% range of CMIP5 transient climate responses (TCRs) matching up with the assessed likely range of TCRs from a variety of other sources, yet
  • Where the real-world transient responses began to diverge from models, this was not determined to decrease confidence in the long-term range of projections, and
  • Confidence in the 5%-95% range of CMIP5 TCRs was deemed to imply confidence in the “most likely” (50%) CMIP5 projections for RCP4.5, despite the AR5 “all-evidence” assessed TCR having a largely different distribution than that of the CMIP5 TCRs.
  • Despite the fact that several assessed “likely” values for TCR imply less than 2°C warming for RCP6.0, it was still determined that there was “high confidence” that greater than 2°C warming was “likely” for RCP6.0
  • Despite the fact that several assessed “likely” values for equilibrium sensitivity (ECS) imply less than 2°C warming for RCP6.0, it was still determined that there was “high confidence” that greater than 2°C warming was “likely” for RCP6.0

October 11, 2013

Relative importance of transient, effective, and equilibrium sensitivities

Filed under: Uncategorized — troyca @ 8:11 pm



I have heard it mentioned that the transient sensitivity (or transient climate response, TCR) is more relevant to climate policy than equilibrium climate sensitivity, and wanted to see the degree to which this is true.  AR4 notes that:

Since external forcing is likely to continue to increase through the coming century, TCR may be more relevant to determining near-term climate change than ECS.

Also, see this very interesting paper by Otto et al., (2013) ERL.  In particular, I wanted to see the relative importance of different types of sensitivity (transient, "effective", and equilibrium) when determining the temperature anomaly in the year 2100 relative to 1850.  For a recap of these types of sensitivities, let’s go back to our simple energy balance model:

C(dT/dt) = F – ΔT*λ

Here, C represents the heat capacity of the system, T represents the surface temperature, F represents the forcing, and λ the strength of radiative restoration.  The TCR refers to the change in temperature in the 1% CO2 increase per year idealized scenario at the time when the concentration doubles, which occurs at 70 years (1.01^70 = 2.0).  Given the large heat sink that is the ocean, there will still be a residual imbalance at the top of the atmosphere after this 70 years.  This means that TCR depends not only on λ, but also on the heat uptake of things like the ocean, which could be represented here in the C term (although note that C varies over time as heat is captured by deeper layers of the ocean, which is why a one-box model is not great for simulating short and long-term transient responses). 
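To make the idealized scenario concrete, here is a minimal Python sketch of the one-box model above, integrated with a simple Euler step. It assumes 3.7 W/m^2 per CO2 doubling and treats the forcing as a linear ramp (CO2 forcing is roughly logarithmic in concentration, so a 1% per year increase gives a roughly linear forcing). The values of λ and C are hypothetical, chosen only for illustration, and the temperature at year 70 stands in for the TCR.

```python
import math

F2X = 3.7   # W/m^2 per CO2 doubling (assumed)

def doubling_time(rate=0.01):
    """Years for CO2 to double at a compound growth `rate` per year."""
    return math.log(2.0) / math.log(1.0 + rate)

def one_box_tcr(lam, heat_capacity, years=70, dt=0.1):
    """Euler integration of C dT/dt = F - lam*T under a linear forcing ramp."""
    temp = 0.0
    for n in range(int(round(years / dt))):
        forcing = F2X * (n * dt) / years   # reaches F2X at `years`
        temp += dt * (forcing - lam * temp) / heat_capacity
    return temp

lam = 1.2                                   # W/m^2/K, hypothetical
tcr = one_box_tcr(lam, heat_capacity=8.0)   # capacity in W yr m^-2 K^-1, hypothetical
equilibrium = F2X / lam                     # warming once the imbalance is gone
```

The transient value stays below the equilibrium value because the system is still taking up heat at year 70.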

For the equilibrium sensitivity, we also use the 2xCO2 scenario, but this time see how much the temperature changes after radiative equilibrium is reached at the top of atmosphere.  Now, in the idealized situation where λ is unchanging, we can calculate the temperature at equilibrium independent of the heat capacity of the system, as simply ΔT = F / λ.  It is this situation where we use a single-value of λ to calculate ΔT that the ΔT is referred to as "effective sensitivity".  Here, we’ll define λ_250 as the radiative restoration strength from 1850-2100, and ΔT = F_2xCO2 / λ_250 will be our "effective sensitivity".  This differs from ECS in that λ could theoretically change over subsequent centuries, as in Armour et al., 2012.  However, defined this way, ECS is irrelevant for this century’s warming, and such a discussion is more just academic (which is why the distinction between ECS and "effective sensitivity" has been minimized in the past). 

So, in this way, I will technically be comparing the importance of TCR vs. effective sensitivity (EFS), although the latter is sometimes used synonymously with ECS.



Here, I will be employing the same two-layer model as in my last post.  Since EFS is determined by λ, it is easy to prescribe in this model.  However, the TCR depends not only on the EFS, but the heat transfer of the ocean as well.  To match the TCR to a target for a given EFS, I fix the "shallow" and "deep" layers at 90m and 1000m respectively, and use a brute force method (ugly, I know) to determine the rate of heat transfer between these layers that will result in this TCR.  From there, I have a model that simulates both the specified TCR and EFS, which I can then use to see the temperature rise from 1850 to 2100 using this model.
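The brute-force matching step can be sketched in Python as below. This is not the script linked in the post: it is a minimal two-layer model with a grid search over the heat-transfer coefficient. The heat capacity per metre of water (about 0.129 W yr m^-2 K^-1), the 3.7 W/m^2 forcing per doubling, the linear forcing ramp, the search grid, and the function names are all assumptions, and the temperature at year 70 stands in for the TCR.

```python
F2X = 3.7                 # W/m^2 per CO2 doubling (assumed)
C_UPPER = 0.129 * 90      # 90 m layer, W yr m^-2 K^-1 (approximate, assumed)
C_DEEP = 0.129 * 1000     # 1000 m layer, W yr m^-2 K^-1 (approximate, assumed)

def tcr_two_layer(gamma, lam, dt=0.1, years=70):
    """Surface warming after a linear forcing ramp that reaches F2X at `years`."""
    t_s = t_d = 0.0
    for n in range(int(round(years / dt))):
        forcing = F2X * (n * dt) / years
        exchange = gamma * (t_s - t_d)   # heat flux from shallow to deep layer
        t_s, t_d = (t_s + dt * (forcing - lam * t_s - exchange) / C_UPPER,
                    t_d + dt * exchange / C_DEEP)
    return t_s

def find_gamma(target_tcr, efs, step=0.01, gamma_max=3.0):
    """Grid search for the exchange coefficient whose TCR is closest to the target."""
    lam = F2X / efs
    candidates = [i * step for i in range(int(round(gamma_max / step)) + 1)]
    return min(candidates, key=lambda g: abs(tcr_two_layer(g, lam) - target_tcr))

# Example: match a 1.8 K TCR for an effective sensitivity of 3.0 K.
gamma = find_gamma(target_tcr=1.8, efs=3.0)
```

A bisection search would be faster, since the TCR decreases steadily as the exchange coefficient grows, but a plain grid keeps the sketch close to the brute-force description.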

I have chosen TCRs of 1.1K, 1.4K, 1.8K, and 2.1K, and EFSs of 1.5K through 4.5K, based on what is feasible/possible for a given TCR.  Again, the adjusted forcings for the RCP scenarios come from Forster et al. (2013). 

Script available here



This first graph is for the RCP6.0 scenario:


As you can see, the relatively flat lines (small slope) indicate that given a specific TCR, the EFS has only a small effect on the temperature that might be expected in 2100.  On the other hand, the large amount of space between the different TCR lines, despite only small changes in the magnitude of TCR, indicates that two models can have the same EFS but produce very different magnitudes of warming by 2100 if their TCRs differ.

To quantify a bit more, the slope of the TCR-2.1 line is 0.16K per EFS, meaning that if the earth had a transient response of 2.1K, the difference between an EFS of 3.0 and an EFS of 4.0 would only produce an extra 0.16K in 2100.  On the other hand, if we fixed the EFS at 3.0 and instead changed TCR, you can see that an increase between TCR of 1.4K and 2.1K (0.7K) produces a change of the same magnitude (~0.7K) in the expected temperature change by 2100, producing a ratio of 1.0 K per TCR.  A similar difference is found at 2.0 EFS when moving from 1.1K to 1.8K.  This would suggest that it is much more important to pin down the TCR (which has a large impact on this century’s warming), even if EFS remains uncertain, rather than trying to pin down EFS.  I should note, however, that for the low TCR scenario (1.1K), the slope of the line is ~0.34 K per EFS, which is double that at 2.1K.  This is because a low TCR suggests a lower heat capacity of the system, which in turn means a quicker pace to equilibrium, which means a greater importance of EFS.

For another look, here is the RCP4.5 scenario:


Here, the results are similar to the RCP6.0, although not quite as drastic.  For a given EFS (2.0 or 3.0), the change in 2100 warming is altered by about 0.75 K per TCR (rather than 1.0 K per TCR in the 6.0 scenario).  Part of this is because we only have ~3/4 of the forcing change (4.5 W/m^2 rather than 6.0 W/m^2), so obviously the warming in general by 2100 is decreased.  However, this alone does not explain why the slopes for the TCR-1.1 and TCR-2.1 have increased to 0.4 K / EFS and 0.2 K / EFS respectively.  For that, I suggest that the earlier stabilization of the 4.5 scenario forcing lends a slightly increased importance to the EFS, given the ~30 years it has to move towards that equilibrium prior to 2100.



Clearly, TCR seems to be more important than EFS (and definitely more so than ECS) in determining the expected warming by 2100.  I tend to agree that in terms of policy, this suggests more focus should be placed on pinning down the TCR.

September 2, 2013

How much more warming would we get if the world stopped emissions right now? Dependence on sensitivity and aerosol forcing.

Filed under: Uncategorized — troyca @ 11:34 am

The question of how much more warming is "in the pipeline" or we are "committed to", if we stopped emissions immediately, is an interesting but not quite trivial one to answer.  Sure, there are GCMs that can run this scenario, but given their seeming inability to correctly reproduce the effective sensitivity relevant on these timescales (Masters, 2013) or the magnitude of the aerosol forcing (as appears to be the case in AR5 drafts), it would be nice to see how these factors influence the amount of warming we could expect under this scenario. 

As part of a larger project, I have been developing some much simpler models that simulate the properties of the CMIP5 multi-model-mean under the various RCP scenarios up to the year 2100, but with the ability to enter specific values for parameters like effective sensitivity, current aerosol forcing, and emissions trajectories that have the largest impact on the warming.  I thought this scenario would be an interesting test for this portion of it (you might notice my R code in this case has the distinctive feel that it has been ported from a different language).

For the first part, we need to know how the temperature responds to forcings.  Obviously, one parameter we will take into account is "effective sensitivity", or rather its inverse in the form of radiative restoration strength – this is the value provided by the "user" or the model.  While the radiative restoration strength is not necessarily a constant, as discussed in previous posts, we want to take a value that is relevant on the century scale.  However, even if the user prescribes this value, we still need to know how fast the temperature will approach this value for a given forcing evolution.  For that part, I attempted to "reverse engineer" the values for the CMIP5 multi-model mean (which is itself a hodge-podge of various runs from various models), by using 1.18 W/m^2/K for radiative restoration (3.1K Effective Sensitivity), as I found in my paper, and proceeded to fit the remaining parameters based on the adjusted forcings and temperature evolution of the 4 RCP scenarios as shown in Forster et al. (2013).  While a one-box model did not quite give a great fit across the various scenarios, using a 2-layer model as described in Geoffroy et al. (2013) seems to do the trick.
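The conversion between radiative restoration strength and effective sensitivity can be sketched in a few lines of Python. The 3.7 W/m^2 forcing per CO2 doubling is an assumption, though it is consistent with the 1.18 W/m^2/K and 3.1 K pairing quoted above; the function names are illustrative.

```python
F2X = 3.7  # W/m^2 per CO2 doubling (assumed)

def effective_sensitivity(lam):
    """Effective sensitivity (K per doubling) from restoration strength (W/m^2/K)."""
    return F2X / lam

def restoration_strength(sensitivity):
    """Restoration strength (W/m^2/K) from effective sensitivity (K per doubling)."""
    return F2X / sensitivity

cmip5_sensitivity = effective_sensitivity(1.18)
```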

For the second part, we need to know the actual forcing evolution.  This too requires some reverse engineering and simplifying assumptions.  In this case, since we are stopping emissions immediately, those species with atmospheric lifetimes << 1 year can see their forcings drop to 0 immediately (aerosols, O3).  For other species, I assumed a constant "effective atmospheric lifetime" and atmospheric fraction, again fitting these values based on the RCP scenarios.  This yielded an effective lifetime of 12 years for CH4 and 185 years for CO2.  Since N2O and the main contributing CFC have similarly long lifespans, I simply lumped these smaller effects in with CO2 by slightly increasing the CO2 forcing for a given concentration (again, I don’t want my "simple" model to require the user to enter 10 different emissions trajectories!).
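One way to read a constant "effective atmospheric lifetime" is as exponential decay of the excess concentration once emissions stop. The Python sketch below makes that assumption explicit; the actual script may treat the decay differently. The two lifetimes are the fitted values quoted above, and the function name is illustrative.

```python
import math

EFFECTIVE_LIFETIME_YEARS = {"CH4": 12.0, "CO2": 185.0}  # fitted values quoted above

def fraction_remaining(species, years_after_stop):
    """Fraction of the excess concentration left, assuming simple exponential decay."""
    return math.exp(-years_after_stop / EFFECTIVE_LIFETIME_YEARS[species])

ch4_after_50 = fraction_remaining("CH4", 50)
co2_after_50 = fraction_remaining("CO2", 50)
```

Under this assumption most of the excess methane is gone within a few decades, while most of the excess CO2 remains.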

Here are the resulting forcing evolutions for the major greenhouse gases were emissions to stop today:



As you can see, due to the longer lifetime of CO2, the decrease in forcing is rather slow over time.  On the other hand, methane drops off rather quickly.  Contrary to the GHG forcing, we would likely see a large jump *up* in forcing from aerosols if we stopped emitting immediately…the magnitude of that jump up obviously depends on the magnitude of the current aerosol negative forcing.

Below are the temperature evolutions for different values of effective sensitivity and aerosol forcing.  For effective sensitivity, I will show the value of 1.8K (as found in my paper), as well as 3.1 K (as I found for the CMIP5 models).  For the aerosol forcing, I will use the values of -1.3 W/m^2 (AR4) and -0.9 W/m^2 (last draft of AR5).  Note that this additional warming chart is only taking into account anthropogenic forcings…it does not include the warming or cooling influences that may come from natural forcings in the future.  


As you can see, all of them have a "bump" within the next 10 years, which represents the immediate increase in forcing due to the drop-off of aerosols.  After that, the temperatures begin to drop as the methane concentrations and CO2 concentrations begin to decrease.  However, the magnitudes are very different!  In these scenarios, choosing different values of sensitivity and aerosol forcing (from what I consider realistically possible values), can mean the difference between 0.6 K and 0.2K warming in the pipeline.  To further highlight this dependence, I will show the more "extreme" values based on the edges of uncertainty of sensitivity and aerosols:


Code and data.

August 20, 2013

Points where I don’t find Andrew Dessler’s >2C ECS video to be convincing

Filed under: Uncategorized — troyca @ 7:53 pm

I saw this video at David Appell’s blog, as well as Eli Rabett’s.  For full disclosure, I think that a < 2C ECS is as likely as not, as my recent paper suggests.  Furthermore, I previously had a paper published noting the sensitivity of a previous cloud feedback paper of Dr. Dessler’s to dataset choice.

While I think the beginning of the video is a fair introduction to the concept of sensitivity and feedback, there basically seem to be 3 arguments that the video makes as to why sensitivity is unlikely to be < 2C, and I do not find them particularly convincing:

1) "From data, you can get that f is about 0.6"


Obviously, there is a lot of "data", and a lot of different methods to interpret this data, all of which give different estimates for ECS (and hence f). In this case, the reference is to Dessler 2013, J Climate, which is not only just one dataset/method combination, but also one that is ill-suited for the purpose of determining ECS.  Note the following huge caveat of the D13 paper:

Second, the differences in the feedbacks between the control model runs and the A1B model runs stress that one should be careful in applying conclusions about the feedbacks derived from internal variability to longer-term climate change. [my bold]

Essentially, this caveat is necessary to pass review because when the method employed by D13 (using ENSO-induced flux variations to derive sensitivity) is applied to the models, it yields a "thermal damping rate" of -0.6 W/m^2/K, which corresponds to a sensitivity of 6.2 K.  Compare that to the more relevant long-term sensitivity calculated from the A1B ensemble of 2.93 K (thermal damping of -1.26 W/m^2/K), and it’s clear that the D13 paper shows that its own method overestimated the sensitivity by a factor of more than 2.  Essentially, if the method fails to adequately diagnose the ECS of the models when it is applied to that data, we’re not likely to have confidence that the method can diagnose the real-world sensitivity. 
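The conversion between thermal damping and sensitivity used here is just the forcing for doubled CO2 divided by the magnitude of the damping. A minimal Python sketch, assuming a forcing of about 3.7 W/m^2 for doubled CO2 (the post does not state the exact value used, so the second decimal place may differ slightly from the numbers quoted above):

```python
F_2XCO2 = 3.7  # W/m^2, assumed forcing for doubled CO2


def sensitivity_from_damping(damping):
    """Sensitivity (K per CO2 doubling) implied by a thermal damping rate (W/m^2/K)."""
    return F_2XCO2 / abs(damping)


enso_based = sensitivity_from_damping(-0.6)   # D13 method applied to the models
long_term = sensitivity_from_damping(-1.26)   # A1B ensemble value
overestimate = enso_based / long_term         # ratio does not depend on F_2XCO2
```

Because the assumed forcing cancels in the ratio, the factor of overestimate is simply 1.26 / 0.6 = 2.1, regardless of the exact forcing value chosen.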

Several other papers have noted the shortcomings in applying the ENSO-induced inter-annual fluctuations to calculating climate-scale feedbacks, and it is a topic I have spent a good amount of time on this blog discussing.  For example, Colman and Hanson (2012):

A comparison is also made of model feedbacks with reanalysis derived feedbacks for seasonal and interannual timescales. No strong relationships between individual modelled feedbacks at different timescales are evident: i.e., strong feedbacks in models at variability timescales do not in general predict strong climate change feedback, with the possible exception of seasonal timescales.  [D13 uses interannual timescales].

This method is essentially the same approach used by Forster and Gregory (2006), which estimated an ECS of 1.6 K.  They obtained different values by using ERBE (rather than CERES) and a different time period, which should give an indication that the method is not particularly robust.  It is also quite sensitive to various methodological choices (using tropospheric rather than surface temps, using lead/lags), which can yield highly different results in sensitivity (as low as 0.6 K in the Lindzen and Choi, 2011 case), but none of them seem to capture the ECS when applied to models, nor do any offer compelling arguments why these choices should make such a huge difference if indeed they are estimates of ECS.  Lest you think I am picking only on "higher sensitivity" results for this method, I have noted similar shortcomings in the Lindzen and Choi (2011) paper and, more recently, the Bjornbom discussion paper.

A while back, I submitted a paper using this type of approach (using satellite fluxes and temperatures to estimate ECS), and several reviewers pointed out that there is little evidence or validation suggesting that these short-term variations in globally-averaged quantities give an indication of longer-term sensitivity.  After doing a good amount more research into the topic, I was inclined to agree.  I would think a similar point was made by reviewers of the D13 paper, and hence the strong caveat mentioned above.  What likely happens is that ENSO produces localized warming and feedbacks at different times during the evolution of a single phase, which means that when it is averaged together in a global quantity it says very little about a long-term response.  

Consider the hubbub that has been made about the difference between ECS and "effective sensitivity", the latter of which was calculated from 50-150 years of data, based on the evolution of spatial warming over time [Armour et al. (2012)].  If we are going to note the deficiency of a century for calculating ECS, it is difficult to see how the inter-annual response will reveal much about ECS.     

2) "…getting a much lower climate sensitivity, say below 2 degrees Celsius, would require a strongly negative cloud feedback."


I disagree!  A strong negative cloud feedback is not required to end up with a sensitivity of < 2K.  Generally, even a neutral cloud feedback will do the trick.  For instance, if you look at Soden and Held (2006), 7 of the 12 models for which cloud feedback is presented would have a < 2K sensitivity in the absence of a cloud feedback.  If the Planck response is ~ -3.2 W/m^2/K, the combined Water Vapor + Lapse Rate (better constrained than each individually) is ~ 1.0 W/m^2/K, and the surface albedo feedback ~ 0.3 W/m^2/K, this corresponds to a thermal damping of -1.9 W/m^2/K with no cloud feedback, or a sensitivity of 1.95 K.  Basically, if one was 50-50 on whether cloud feedback was positive or negative, I would say a < 2K sensitivity is more likely than not (albeit barely).  Even a slightly negative cloud feedback (~ -0.3 W/m^2/K) would almost ensure an ECS < 2K.
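The feedback budget in this argument is easy to check. The Python sketch below sums the feedback values quoted above and converts the net damping into a sensitivity. The 3.7 W/m^2 forcing for doubled CO2 is an assumption, and the +0.3 W/m^2/K cloud case is included only as a hypothetical mirror image of the slightly negative case.

```python
PLANCK = -3.2    # W/m^2/K, Planck response quoted in the text
WV_LAPSE = 1.0   # W/m^2/K, combined water vapor + lapse rate feedback
ALBEDO = 0.3     # W/m^2/K, surface albedo feedback


def ecs(cloud_feedback, f_2xco2=3.7):
    """Sensitivity (K) for a given cloud feedback (W/m^2/K); f_2xco2 is assumed."""
    net_damping = PLANCK + WV_LAPSE + ALBEDO + cloud_feedback
    return f_2xco2 / -net_damping


neutral_cloud = ecs(0.0)    # damping of -1.9 W/m^2/K
negative_cloud = ecs(-0.3)  # slightly negative cloud feedback
positive_cloud = ecs(0.3)   # hypothetical slightly positive cloud feedback
```

With these inputs the neutral case sits just under 2 K, the slightly negative case falls to roughly 1.7 K, and it takes a positive cloud feedback to push the result above 2 K.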

3) "…because we don’t know the forcing, I don’t look at the estimates of climate sensitivities from these studies [using 20th century observations] to be very meaningful."


The dismissal of essentially all recent estimates of ECS from investigation into the 20th-21st century climate based on uncertainty in forcing is highly dubious, in my opinion.  First of all, almost all of these studies explicitly take into account the uncertainties in forcing.  In fact, my paper shows that even in spite of the uncertainty in forcing, the method used gives a good indication of longer-term sensitivity as tested on the CMIP5 models.  These methods have the additional advantage of deriving climate-scale feedbacks, something lacking in the methodology of D13 (as admitted in the paper).  Moreover, updated estimates for aerosols suggest that previously these effects were over-estimated…so reworking my paper with smaller impacts would actually lower the estimate of ECS.

If we are dismissing methods based on uncertainty, there should be little made of paleo estimates, all of which have even greater uncertainty in the forcing, along with large uncertainties in the temperature changes.  The argument of "we have uncertainty, therefore we know nothing" is an oft-criticized argument (and I agree with the criticism), because everything has uncertainty.  But when this uncertainty is explicitly (and correctly) taken into account, there is no reason to discard the estimates.


Essentially, even if one weighed all the estimates using inter-annual flux variations with temperature changes (among which D13 is one estimate), it would be difficult to say that < 2.0 K ECS is "unlikely".  If one took into consideration the 20th-21st century estimates as well, it is near impossible to call it "unlikely".  Obviously, science is not democratic and some papers are better than others…but it is only by specifically focusing on the D13 results (using a method that seems to fail validation tests) while discarding the 20th century energy balance observations (a method which "passes" similar validation tests) that the video claims a < 2K sensitivity is unlikely.   

July 13, 2013

What does Balmaseda et al. 2013 say about the “missing heat”?

Filed under: Uncategorized — troyca @ 12:27 pm

There seems to be a good amount of confusion about what the "missing heat" refers to, as well as what implications the Balmaseda et al (2013) paper has for this missing heat.  So first, here is my quick summary:

In many ocean datasets, there seems to be a discrepancy in the rate of ocean heat uptake up to the mid 2000s, and the uptake afterwards, with this more recent rate of uptake appearing to be smaller.  However, neither models nor satellites show a decrease in the TOA imbalance (as the increased forcing should only exacerbate this imbalance).  The "missing heat" means that theoretically, there should be more heat going into the ocean (that is, at the same rate as before).  The other possibility is that there was "extra heat" before – that is, the discrepancy in the ocean heat uptake is an illusion, and that prior calculations were overestimating this heat uptake.  This latter assumption would imply that the GCMs are generally overestimating the TOA imbalance. 

Now let’s look at this within the context of Balmaseda et al., (2013):


Indeed, you can see that even in the purple line, the slope in the early part of the 2000s is larger than that of the later part of the decade.  After digitizing this and calculating the TOA imbalance, I get 1.23 W/m^2 for the first part of the decade (2000-2004), and 0.38 W/m^2 for the last half (2005-2009).  That is a huge discrepancy.  So let’s see if we see something similar in the satellites (CERES SSF1 degree net TOA imbalance annual anomaly):


A drop of ~ 0.85 W/m^2 should be quite obvious in this graph, but there is no sign of that at all.  Rather, the average imbalance from 2005-2009 is about 0.17 W/m^2 *larger* than that from 2000-2004.  If we called the combined discrepancy of ~ 1 W/m^2 imbalance for 5 years the "missing heat", we are talking somewhere around 8 * 10^22 J.
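For anyone wanting to check the arithmetic, here is a short Python sketch of the conversion from a sustained global imbalance to accumulated heat. The Earth surface area of about 5.1 * 10^14 m^2 and the 365.25-day year are assumptions not stated above.

```python
EARTH_SURFACE_AREA_M2 = 5.1e14          # assumed global surface area
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed year length


def accumulated_heat_joules(imbalance_w_m2, years):
    """Energy accumulated by a sustained global-mean imbalance (W/m^2)."""
    return imbalance_w_m2 * EARTH_SURFACE_AREA_M2 * years * SECONDS_PER_YEAR


ocean_drop = 1.23 - 0.38            # drop implied by the digitized ocean data
combined = ocean_drop + 0.17        # plus the increase seen by CERES
missing_heat = accumulated_heat_joules(1.0, 5)  # ~1 W/m^2 sustained for 5 years
```

Rounding the combined discrepancy of about 1.02 W/m^2 down to 1 W/m^2 and holding it for 5 years gives a little over 8 * 10^22 J.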

Note that we are only using the satellites in this context in terms of *relative* TOA imbalance.  Some are under the misconception that we are able to measure the absolute TOA imbalance via satellite, and that using ocean heat content is just a secondary check (that is, we know that the extra heat is somewhere in the earth system from satellites, and just need to look harder to find it in the ocean).  This view is incorrect.  As explained in Stephens et al., 2012:

The combined uncertainty on the net TOA flux determined from CERES is ±4  Wm–2 (95% confidence) due largely to instrument calibration errors.  Thus the sum of current satellite-derived fluxes cannot determine the net TOA radiation imbalance with the accuracy needed to track such small imbalances associated with forced climate change

In other words, in terms of determining the absolute energy budget, ocean heat content is really the only game in town.  Satellites instead provide a check in terms of the evolution of this ocean heat content data, and have managed to raise a red flag in the ocean heat uptake slowdown.  In this much, I don’t see that Balmaseda et al. 2013 has really solved the case of the "missing heat", as we still see a large unexplained discrepancy in the rate of ocean heat uptake.  While surface winds and deep ocean warming may help explain a pause in surface warming, they do not explain this lower implied TOA imbalance.

Loeb et al. 2012 essentially “solved” this problem by noting the large uncertainties in the ocean heat datasets, which implied that the apparent discrepancy was likely an artifact of inaccurate ocean measurements.  Given the vastly larger coverage of ARGO from 2005 on, and the fact that these estimates range from implied TOA imbalances of ~ 0.38 W/m^2 to 0.6 W/m^2 (von Schuckmann and Le Traon, 2011; Stephens et al, 2012; Hansen et al, 2011; Masters 2013), and that we actually see a slight increase in TOA imbalance towards the later part of the CERES decade, I suspect that the huge rate of heating in the early part of the decade in Balmaseda et al. 2013 may be an artifact as well.  Such would imply that the B13 primary published ocean warming of 1.19 +/- 0.11 W/m^2 in the 2000s (0.84 W/m^2 globally) may be 1.5x to 3x too high.

In the next post, I hope to look at this last point in some more depth, and at "missing heat" in terms of TOA imbalance relative to various GCMs.

 Data and code.

May 21, 2013

Another “reconstruction” of underlying temperatures from 1979-2012

Filed under: Uncategorized — troyca @ 7:56 am

Or, “could the multiple regression approach detect a recent pause in warming, part 4”.  For those following the series, you know that what I mean by “underlying temperatures” is the temperature evolution we would see if we removed the influence of solar, volcanic, and ENSO variations.

It has been a while since I posted the first three parts of a series on whether using multiple linear regressions to remove the solar, volcanic, and ENSO effects from temperature was an accurate way to "reconstruct" the underlying trend. Generally, these did not perform too well, and tended to overestimate the solar influence and underestimate the volcanic influence, particularly if there was indeed a "slowdown" in the underlying temperature data.  One of the problems with that method is that it includes an assumption about the form of the underlying trend when doing the regressions.  

So, I thought I’d put a temperature series (actually, a couple of options) out there that have been adjusted for these factors, using a method that is not particularly sensitive to the form of the underlying trend.  Essentially, I take the multi-model mean of the models I used in the last post in this series to adjust for the volcanic and solar components, and then remove ENSO based on a regression against that adjusted series.  Fortunately, the ENSO variations are high enough frequency that the regression is not particularly sensitive to the form of the underlying trend (whether it be linear or quadratic) as we have limited the number of variables.

It should be noted that this method might *over-adjust* for volcanic and solar if the CMIP5 models are too sensitive, which my recent paper (Masters 2013, Climate Dynamics) seems to indicate.  I have therefore included an adjusted series that adjusts by only 50% of the MMM as well.  Since the difference between the sensitivities in the transient state is likely to be less than after equilibration, let’s say the "true" adjustment should lie somewhere in-between those two adjustments.
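The two-step adjustment can be sketched in a few lines of Python. This is an illustration on synthetic data, not the R code linked from this post: the function names are hypothetical, the underlying trend is assumed linear in the regression, no ENSO lag is applied, and the input series are invented so that the "true" answer is known.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares(columns, y):
    """Ordinary least squares via the normal equations (fine for 3 regressors)."""
    k = len(columns)
    ata = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    aty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(k)]
    return solve_linear(ata, aty)


def underlying_series(obs, mmm_natural, enso, scale=1.0):
    """Subtract scale * MMM (solar + volcanic), then regress out ENSO."""
    adjusted = [o - scale * m for o, m in zip(obs, mmm_natural)]
    ones = [1.0] * len(obs)
    time = [float(i) for i in range(len(obs))]
    # Regress on intercept, linear trend (assumed form), and the ENSO index.
    _, _, enso_coef = least_squares([ones, time, enso], adjusted)
    return [a - enso_coef * e for a, e in zip(adjusted, enso)]


# Synthetic 34-year example: a 0.02 K/yr trend, half of the MMM natural signal,
# and an ENSO-like oscillation. All of these numbers are made up.
years = list(range(34))
enso = [math.sin(0.9 * t) for t in years]
mmm_natural = [-0.3 if t in (3, 4, 12, 13) else 0.0 for t in years]
obs = [0.02 * t + 0.5 * m + 0.1 * e for t, m, e in zip(years, mmm_natural, enso)]

recovered = underlying_series(obs, mmm_natural, enso, scale=0.5)
```

In this contrived case the 50% scaling matches how the synthetic data were built, so the trend is recovered essentially exactly; using scale=1.0 on the same data would over-adjust for the volcanic dips, which is the concern raised above.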

Anyhow, here is the reconstructed series of NCDC (NOAA) temperatures.  (On a side note, I have become a little annoyed with trying to grab data from HadCRUT4 and GISS.  The former seems to return a "Not Found" error quite frequently, and the latter doesn’t let the R default user-agent grab data at all.  Hence the usage of NOAA temperatures). 


If I were to go strictly by the eyeball test, the blue line (adjusted by 50% of MMM) seems to get it “most right” in terms of compensating for the volcanic eruptions without over-adjusting.  Below are the trends for the various start years ending in 2012 in these series:


Here you’ll note that the “adjusted” series actually results in a lower trend for all start years up until about 2001, when the influence of ENSO seems to really take over.  The blue line never dips below 0 for these adjusted trends of 10 years or longer, so one could argue that the underlying warming (if the blue line indeed captures this correctly) never really “stopped”.  On the other hand, the trends are substantially lower towards the end than they are at the beginning (and indeed smaller than in most model runs), so saying that the recent “slowdown” is simply the result of known natural factors rings a bit hollow to me.  It would be interesting to run a similar experiment on the CMIP5 model runs and see how much “natural” variation remains in those runs, or if this is something unique to the real world.

Code and data for this post available here.
