November 1, 2010

Summary of US UHI tests with all datasets

Continuing the work from before (most recently here), I went ahead and modified the Java code to handle several new temperature dataset formats. Here I’ll run the same tests I’ve done before, plus tests on a couple of new datasets: GHCN v3 beta and Ron Broberg’s GSOD work.

Data Everywhere

My code and intermediate data are available HERE (I manually extracted all US stations from the global list using Excel).

Other relevant data: GHCNv2, USHCNv2, GSOD, GHCNv3 (beta), GWPv3 (population data)

Changes to Calculation Method

Previously, I’ve been using (Pop2000-Pop1990) / (Pop2000) for the changing population X-axis value. This was a rather boneheaded move, because it gives a decrease in population a larger magnitude than an equal-sized increase: a town that doubles scores +0.5, while one that halves scores -1.0. From here on out I’ll be using a more sensible calc of (Pop2000-Pop1990) / (Pop2000+Pop1990), which is symmetric and gives +1/3 and -1/3 for the same two cases.
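To make the difference concrete, here is a minimal Java sketch of the two calcs side by side. The class and method names are made up for illustration and are not taken from my actual code; it also assumes both populations are positive, since a zero denominator would give NaN or Infinity.

```java
/** Hypothetical helper comparing the old and new population-change calcs. */
final class PopulationChange {

    // Old calc: (Pop2000 - Pop1990) / Pop2000.
    // Asymmetric: a decrease scores larger in magnitude than an equal-ratio increase.
    static double oldMetric(double pop1990, double pop2000) {
        return (pop2000 - pop1990) / pop2000;
    }

    // New calc: (Pop2000 - Pop1990) / (Pop2000 + Pop1990).
    // Symmetric: swapping the two years only flips the sign.
    static double newMetric(double pop1990, double pop2000) {
        return (pop2000 - pop1990) / (pop2000 + pop1990);
    }

    public static void main(String[] args) {
        // A town that doubles (1000 -> 2000) vs. one that halves (2000 -> 1000).
        System.out.println("old, doubling: " + oldMetric(1000, 2000)); // 0.5
        System.out.println("old, halving:  " + oldMetric(2000, 1000)); // -1.0
        System.out.println("new, doubling: " + newMetric(1000, 2000)); // about +1/3
        System.out.println("new, halving:  " + newMetric(2000, 1000)); // about -1/3
    }
}
```

For positive populations the new value always falls strictly between -1 and +1, which keeps shrinking towns from dominating the X-axis.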

Summary Table

As you can see, the “signal” appears far stronger in all of the HCN data sets than in the GSOD data set. This could simply be a fluke, since the HCN data sets share many of the same stations and so are not independent of one another. What’s also interesting is that the adjusted data sets all have a higher correlation and slope than their unadjusted counterparts. As I’ve suggested before, I don’t believe this is because the adjustments are adding in more UHI errors; rather, I think they are correcting other errors, which lets the UHI signal come through more clearly. This may explain the low correlation in the GSOD data set, which clearly has had the least QC.
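For anyone wanting to reproduce the slope and correlation numbers, here is a rough Java sketch. It assumes an ordinary least-squares fit and a Pearson correlation coefficient, with the population-change value as x and the temperature trend as y; the class and field names are hypothetical and this is not lifted from my actual code.

```java
/** Hypothetical holder for a least-squares slope and Pearson correlation. */
final class LinearFit {
    final double slope;
    final double correlation;

    private LinearFit(double slope, double correlation) {
        this.slope = slope;
        this.correlation = correlation;
    }

    // Fits y = a + slope * x by ordinary least squares and computes Pearson's r.
    // If all x (or all y) values are identical the result is NaN.
    static LinearFit fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two equal-length arrays with at least 2 points");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of squared deviations and cross-deviations about the means.
        double sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            syy += dy * dy;
            sxy += dx * dy;
        }
        return new LinearFit(sxy / sxx, sxy / Math.sqrt(sxx * syy));
    }
}
```

Calling `LinearFit.fit(x, y)` on one data set's station values gives the pair of numbers to compare across data sets. Note that the slope depends on the units of y, while the correlation does not.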


-The graphs below will show a slope 10x larger for GHCNv3 than for the other data sets, because GHCNv3 reports in hundredths of a degree instead of tenths.

-For calculating yearly anomalies for the global data sets, I’ve once again required 12 months of reported data.
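The two notes above can be sketched together in Java. This is only an illustration under a few assumptions: missing months are represented as NaN (the raw files use their own missing-value flags), the method name and the `unitsPerDegree` parameter are made up here, and the result is a plain yearly mean that would still need a baseline subtracted to become an anomaly.

```java
import java.util.OptionalDouble;

/** Hypothetical helper: yearly mean that requires all 12 months to be present. */
final class YearlyMean {

    // monthly: 12 raw monthly values, with NaN marking a missing month (an assumption).
    // unitsPerDegree: 10 for data in tenths of a degree, 100 for hundredths.
    // Returns empty if any month is missing, otherwise the mean in whole degrees.
    static OptionalDouble yearlyMeanDegrees(double[] monthly, double unitsPerDegree) {
        if (monthly.length != 12) {
            throw new IllegalArgumentException("expected 12 monthly values");
        }
        double sum = 0;
        for (double value : monthly) {
            if (Double.isNaN(value)) {
                return OptionalDouble.empty(); // incomplete year: skip it entirely
            }
            sum += value;
        }
        return OptionalDouble.of(sum / 12.0 / unitsPerDegree);
    }
}
```

Dividing by `unitsPerDegree` before fitting would put GHCNv3 on the same scale as the other data sets; the graphs here do not do that, hence the 10x slope.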


