I’ve been working on this new iteration in the background for a little while. Previously, two things created stumbling blocks: the increase in temperature over a region coincides with the increase in population density, and the amount of population data was limited.
As before, I want to examine how changes in population affect station temperatures, since such changes could in theory occur at both rural and urban stations. See my original post here for more background.
Intermediate data and code for this post can be downloaded here.
New Population Data
Previously, I’ve been using GPWv3 for my population density data. Unfortunately, this only included actual population data for the years 1990 and 2000 in the U.S. However, aggregate U.S. census data for every decade can be downloaded from the National Historical Geographic Information System.
There may be a better way to do it, but after downloading the population data by “place” from the NHGIS data finder, I matched it to the U.S. stations in the USHCN database using a combination of Excel macros and matching by hand. (A sample macro showing how I performed some of the mapping is included in the code above, along with the population data by station for the years 1970, 1980, 1990, and 2000.) So now we have at least 3 decades of station population data to work with.
The temperature data I’m using is once again USHCNv2.
1) For each station, the linear temperature trend (dT) is calculated based on the data from 1970-2000. Similarly, we calculate the linear trend of the log of the population for the station (referred to as dP for simplicity) using the 1970, 1980, 1990, and 2000 population data. A station is only included IF at least 25 of its 31 years report annual temperature AND it has population for all 4 population years.
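As a rough illustration of step 1, here is a minimal Python sketch of the trend calculation and the inclusion rule. It is not the code from the download above. The function names, the dictionary inputs, the use of the natural log, and the rejection of non-positive populations are all my assumptions for the sake of the example.

```python
import math

CENSUS_YEARS = (1970, 1980, 1990, 2000)


def linear_trend(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def station_trends(annual_temps, population, min_years=25):
    """Return (dT, dP) per year for one station, or None if it is excluded.

    annual_temps: dict of year -> annual mean temperature (missing years absent)
    population:   dict of census year -> population count
    Both input layouts are hypothetical, chosen only for this sketch.
    """
    years = [y for y in range(1970, 2001) if annual_temps.get(y) is not None]
    if len(years) < min_years:
        return None  # fewer than 25 of the 31 years report a temperature
    pops = [population.get(y) for y in CENSUS_YEARS]
    if any(p is None or p <= 0 for p in pops):
        return None  # assumption: a missing or zero population excludes the station
    d_t = linear_trend(years, [annual_temps[y] for y in years])
    # Assumption: natural log; the base only rescales dP by a constant.
    d_p = linear_trend(CENSUS_YEARS, [math.log(p) for p in pops])
    return d_t, d_p
```

With a station that warms 0.02 degrees per year and doubles in population each decade, this gives dT of about 0.02 and dP of about ln(2)/10 per year.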
2) Based on latitude and longitude, we match all close station pairs. (I should mention that Ron suggested something similar way back in an earlier comment.) For now I’ve matched based on “distance” in terms of degrees, which technically is not a valid physical distance (a degree of longitude covers less ground at higher latitudes), but should be close enough for this preliminary work. The station pairs that are included are based on this “threshold” distance…I’ll show how this threshold affects various results below. Obviously, the higher the threshold, the more station pairs we get.
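To make the pairing in step 2 concrete, here is a small Python sketch. It assumes the “distance” in degrees is the straight-line (Euclidean) distance between (latitude, longitude) points, which is one plausible reading but not necessarily the exact formula used for the results below. The station IDs and the `match_pairs` name are made up for the example.

```python
import math
from itertools import combinations


def match_pairs(stations, threshold):
    """Return every station pair whose separation in degrees is within threshold.

    stations:  dict of station id -> (latitude, longitude) in degrees
    threshold: maximum separation, also in degrees
    Returns a list of (id_a, id_b, distance) tuples.
    """
    pairs = []
    for (id_a, (lat_a, lon_a)), (id_b, (lat_b, lon_b)) in combinations(
        sorted(stations.items()), 2
    ):
        # Assumption: plain Euclidean distance in degree space, not a
        # great-circle distance.
        dist = math.hypot(lat_a - lat_b, lon_a - lon_b)
        if dist <= threshold:
            pairs.append((id_a, id_b, dist))
    return pairs


# Hypothetical stations, for illustration only.
stations = {"A": (40.0, -100.0), "B": (40.3, -100.4), "C": (45.0, -90.0)}
close_pairs = match_pairs(stations, threshold=1.0)
```

Here only A and B fall within a 1 degree threshold. Comparing every station against every other is quadratic in the number of stations, which should be manageable for a network the size of USHCN.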
3) We then graph these station pairs, plotting the difference in their linear temperature trends against the difference in their log population trends. We should see a positive correlation if we believe population to be a proxy for UHI and that this effect is discernible in the station records.
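Step 3 boils down to a regression of one set of differences on the other. The sketch below is an illustration under my own assumptions rather than the code behind the plots: it takes per-station (dT, dP) values, forms the pairwise differences, and returns the least-squares slope and the Pearson correlation. The names `pair_differences` and `slope_and_r` are hypothetical.

```python
import math


def pair_differences(pairs, trends):
    """Build x = difference in dP and y = difference in dT for each pair.

    pairs:  iterable of (id_a, id_b)
    trends: dict of station id -> (dT, dP)
    """
    xs, ys = [], []
    for id_a, id_b in pairs:
        d_t_a, d_p_a = trends[id_a]
        d_t_b, d_p_b = trends[id_b]
        # The order within a pair is arbitrary, but it must be the same
        # for both differences.
        xs.append(d_p_a - d_p_b)
        ys.append(d_t_a - d_t_b)
    return xs, ys


def slope_and_r(xs, ys):
    """Least-squares slope of ys on xs, and the Pearson correlation r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Made-up trends in which dT = 0.01 + 0.5 * dP holds exactly.
trends = {"A": (0.02, 0.02), "B": (0.015, 0.01), "C": (0.03, 0.04)}
xs, ys = pair_differences([("A", "B"), ("A", "C"), ("B", "C")], trends)
slope, r = slope_and_r(xs, ys)
```

Differencing removes whatever the two stations in a pair have in common (the 0.01 constant in this toy example), so the slope reflects only how the temperature trend differs as the population trend differs.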
Results – USHCN raw
Results – USHCN TOB Adjusted
Results – USHCN F52 Adjustments
At first glance, this looks to me like there is indeed a UHI signal present based on population change. All of the tests above seem to show at least some significant correlation. This method seems to be fairly robust, but I’ve been wrong before.
On the other hand, it is hard to get a feel for the magnitude of this effect, since the slope varies widely across the tests. Furthermore, even if we DO pick one of the higher slopes, an early look suggests the effect would not be very large relative to the temperature trend in the U.S. during the period. I hope to do a more formal post investigating the magnitude of this effect in the future.
There are a number of other things I hope to do to help tease out the signal. I still feel the per-station population data is suspect, so I’m looking for ways to improve its accuracy at the actual station location. Including more decades going back may also help eliminate some of the noise. But perhaps what I’m most excited about are the other variables available for download at NHGIS, some of which include land use, which may be a better proxy than population for determining the magnitude of the UHI effect.