As I suggested in the original post documenting my experience, one of my goals was to reproduce the USHCNv2 F52 dataset using the Pairwise Homogeneity Algorithm included in the software folder. I e-mailed a few of my questions from that post to Dr. Menne, who helpfully answered them. I thought I’d share these answers in case anybody else is trying to do something similar:
1) The PHA is run separately on Tmax & Tmin series to produce the USHCNv2 dataset. We then compute Tavg as (Tmax+Tmin)/2.
2) Unfortunately, we never put the .HIS station history files out. These files contain 3 different sources for station history information:
0 – the manual QC done by the original USHCNv1 team (static)
1 – information gleaned from the U.S. Cooperative Observer Summary of the Day dataset (not used)
2 – information extracted from the meta database (MMS) maintained at NCDC (in flux as updates and corrections are added)
We will look into bundling the history files with the rest of the code package sometime in the next year.
3) Yes, the production run uses the parameters supplied by the *.incl
A couple of other notes:
A full reprocess of the USHCN monthly temperature data by the PHA has not been executed since 28 May 2008. Recent data are simply being appended to the output of the May 2008 run. In 2011, there will be a new release of the USHCNv2 (F52g) concurrent with the release of GHCN Monthly Version 3 (currently available as a beta release).
As we indicated in the Menne et al. 2009 USHCNv2 overview article, we use all Cooperative Observer monthly temperature series to homogenize the USHCNv2 subset. This greatly expands the neighbor pool for USHCN sites. Therefore, you would need to run the PHA on the full Coop temperature database circa May 2008 in order to reproduce the results on our ftp site. We will also look into including this Coop database as part of the code release early next year.
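For anyone following along, the Tavg step from answer 1 is simple enough to sketch. This is only my own illustration, not code from the PHA package: the function name `tavg_series` and the use of `None` for a missing month are assumptions I made for the example, and the real files use their own missing-value conventions. The point is just that the averaging happens after Tmax and Tmin have each been homogenized separately.

```python
from typing import List, Optional, Sequence


def tavg_series(
    tmax: Sequence[Optional[float]],
    tmin: Sequence[Optional[float]],
) -> List[Optional[float]]:
    """Combine already-homogenized monthly Tmax and Tmin into Tavg.

    Hypothetical helper: a month is treated as missing (None) in the
    output if either input is missing. That rule is an assumption for
    this sketch, not documented USHCN behavior.
    """
    if len(tmax) != len(tmin):
        raise ValueError("Tmax and Tmin series must cover the same months")
    return [
        None if hi is None or lo is None else (hi + lo) / 2
        for hi, lo in zip(tmax, tmin)
    ]


# Three months of made-up values, with one missing Tmax
example = tavg_series([10.0, None, 20.0], [0.0, 5.0, 10.0])
```

Here `example` would hold the mean for the first and third months and a missing value for the second.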
What this tells me is that I may need to put my attempts at reproducing the F52 dataset on the back burner for the time being, although hopefully I'll be able to pick it back up again in the upcoming year.
Update (1/9): In addition to the comment below, I see that Ruhroh has queried RomanM regarding the process of separately adjusting max vs. min temperatures. As he is a statistician and I am not, you may want to read his response here.