Troy's Scratchpad

August 30, 2010

More UHI sniffing in GHCN

Filed under: Uncategorized — troyca @ 10:25 pm

Continuing the work from Part I and Part II of my U.S. analysis of UHI as a function of population, I’ll now look at the global temperature dataset.
The GHCN data can be retrieved from here:

I used the MEAN_ADJ data for all of my analysis.

Code and relevant data for this post can be found here:

My first attempt was to treat GHCN almost exactly as I had USHCN. Some subtle formatting differences made me modify the algorithm slightly for the “global” format, the biggest being that I needed to calculate the yearly average from the monthly data rather than having it provided. Also, some of the subsets below were manually processed in Excel (to remove US stations).

NOTE: the GHCN temps are reported in tenths of a degree Celsius, rather than Fahrenheit.

With the first chart we get something like this (from dataset A):

Does this mean we have an independent confirmation at the global level of the phenomenon found in the United States?  Umm, no.  A closer look at the dataset reveals that out of the 771 stations included in this subset of “valid” station data, only 42 lie outside of the United States!  So of course we’re going to see something similar to the USHCN if most of these valid stations are the same.

I thus changed what data was considered “valid”.  My original requirement was as follows – I needed to have valid year averages available for 1990 and 2000, to use the endpoint temp comparison method.   I only got a “valid” year average if every month reported over the course of the year.  It may be amazing, but from that GHCN dataset, only those 42 stations seem to report for all twelve months in 1990 AND all twelve months in 2000.
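The validity rule and end-point comparison described above can be sketched roughly as follows. This is only an illustration, not the code linked from this post: the function names, the `-9999` missing-value sentinel, and the year-to-months dictionary layout are all assumptions, and the end-point change is assumed to be the 2000 average minus the 1990 average.

```python
MISSING = -9999  # assumed sentinel for a month with no report


def yearly_mean_c(monthly_tenths):
    """Mean of 12 monthly values (tenths of deg C), returned in deg C.

    Returns None unless every one of the 12 months reported.
    """
    if len(monthly_tenths) != 12 or MISSING in monthly_tenths:
        return None
    return sum(monthly_tenths) / 12 / 10.0


def endpoint_change_c(station, start=1990, end=2000):
    """End-point change (end minus start) in deg C, or None if invalid.

    `station` is a hypothetical dict mapping year -> list of 12 monthly values.
    """
    means = [yearly_mean_c(station.get(year, [])) for year in (start, end)]
    if None in means:
        return None
    return means[1] - means[0]
```

A single missing month in either 1990 or 2000 makes the whole station invalid under this rule, which is why so few stations survive it.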

So I lowered the requirement: a yearly average is now calculated from months 1, 4, 7, and 10 only. Not perfect, but this gives a sampling from all seasons. It means that a station only needs to report for 8 specific months (those four in 1990 and in 2000) rather than all 24 to be considered valid. This result is dataset B.
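A minimal sketch of the relaxed yearly average, under the same assumptions as before (a hypothetical `-9999` sentinel for missing months and values in tenths of a degree Celsius; the names are made up for illustration):

```python
MISSING = -9999  # assumed sentinel for a month with no report
SAMPLE_MONTHS = (1, 4, 7, 10)  # January, April, July, October


def sampled_yearly_mean_c(monthly_tenths, months=SAMPLE_MONTHS):
    """Yearly mean in deg C from the sampled months only.

    Returns None if any sampled month is missing; the other
    eight months are ignored entirely.
    """
    values = [monthly_tenths[m - 1] for m in months]
    if MISSING in values:
        return None
    return sum(values) / len(values) / 10.0


# February is missing here, but the year still counts as valid.
year = [-50, MISSING, 20, 80, 130, 180, 210, 200, 160, 100, 40, -20]
print(sampled_yearly_mean_c(year))
```

The trade-off is that gaps in the unsampled months no longer disqualify a year, at the cost of a noisier yearly value.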
This increases the number of non-US stations by about 30%, to 55:

The next recourse was thus to relax the requirement for having data in both 1990 and 2000. This eliminated the option of using the “end-point” method, so I needed to use the “OLS trend” method instead. To ensure a fair sampling, I required that at least 4 years between 1990 and 2000 report (still using the 4-month average for the yearly temp), rather than requiring 1990 and 2000 specifically.
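The OLS trend with a minimum-years requirement can be sketched like this. It is an illustration rather than the code linked from this post: the function name and input layout are hypothetical, the 1990 to 2000 window is assumed to be inclusive, and the slope is the ordinary least-squares fit of yearly temperature against year.

```python
def ols_trend_c_per_year(yearly_means, first=1990, last=2000, min_years=4):
    """OLS slope in deg C per year, or None if too few years reported.

    `yearly_means` is a hypothetical dict mapping year -> yearly mean
    in deg C, with None for years that had no valid average.
    """
    points = [(year, temp) for year, temp in sorted(yearly_means.items())
              if first <= year <= last and temp is not None]
    # A slope needs at least two distinct years regardless of min_years.
    if len(points) < max(min_years, 2):
        return None
    n = len(points)
    year_bar = sum(year for year, _ in points) / n
    temp_bar = sum(temp for _, temp in points) / n
    sxx = sum((year - year_bar) ** 2 for year, _ in points)
    sxy = sum((year - year_bar) * (temp - temp_bar) for year, temp in points)
    return sxy / sxx
```

Unlike the end-point method, this uses whichever years are available in the window, so a station missing 1990 or 2000 can still contribute a trend.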

This is the resulting dataset (C) with only the non-US portion:

So do we have further proof of the UHI effect on the global scale?  Once again, unfortunately not.

The following is our “global” station breakdown by country in those 115 obs:






As you can see, the set is dominated by Korea, which greatly affects our resulting trend even though Korea makes up only a small portion of the Earth.
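A breakdown like this can be tallied with a few lines of code. The sketch below assumes that the first three characters of each station ID are the country code (my understanding of the GHCN v2 ID layout, so treat it as an assumption); the function name and the station IDs in the example are made up.

```python
from collections import Counter


def country_breakdown(station_ids):
    """Return (country_code, count, share) tuples, largest count first.

    Assumes the first three characters of each ID are the country code.
    """
    counts = Counter(station_id[:3] for station_id in station_ids)
    total = sum(counts.values())
    return [(code, n, n / total) for code, n in counts.most_common()]


# Made-up IDs: three stations share one country prefix, one has another.
ids = ["11100000001", "11100000002", "11100000003", "22200000001"]
for code, n, share in country_breakdown(ids):
    print(code, n, f"{share:.0%}")
```

Checking the share of the largest country before fitting any trend is a quick way to spot a subset that is "global" in name only.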

At this point, I could bemoan the lack of quality stations in the GHCN dataset from 1990 to 2000, but rumor has it some other citizen scientists have been working on another dataset.  I hope to make use of this in the near future to continue investigating this at the global level…

