Will The Real Temperature Please Stand Up!

Many a moon ago (just last November) I finally decided to look at some temperature records myself. For many years, because of what I learned at Naval Nuclear Power School, I had serious reservations about the “science” behind CAGW, but I never looked. Then along came Climategate and all that, and I finally taught myself how to use the spreadsheet program in Open Office and started looking. The first question was where to start, and for a beginner the answer was to keep it simple. So I found the NASA GISS site and decided to get the data for State College, Pa. The reason I picked that locale was fairly simple: it’s a university that maintains a station in the NASA station list, and being a university you could reasonably expect it to have a good record that would require little QC. On top of that, I knew the area from growing up not too far away. So I got the data and looked up where the thermometers are located. I found that there had been a station move (from a grassy area to the roof of a building) but nothing too major. So the GISS adjusted data shouldn’t have needed a lot of work, right? Right? WRONG! Here is the graph of what I found when I compared the input data (“raw”) to the adjusted output data:

So that left me scratching my head. How can you go from a cooling trend to a warming trend? I could see no reason for such a radical adjustment. That’s when I noticed that GISS doesn’t really use “raw” data. They use data from two different databases, one called GHCN and the other USHCN. GHCN has a few US stations (including State College) plus data from stations all over the world. USHCN has the data from over 1200 US stations (including State College). So I decided to take a look at GHCN “raw” and adjusted to see if either one looked like the input to GISS. I noticed a few things right off the bat. First, there was data from before the GISS start date, but that isn’t a real big problem. Second, there was no data after 2005 in GHCN for State College. That’s a problem because GISS has data for 2006 to 2008, so I assumed that data had to come from USHCN. So I went and got the USHCN data and looked. It has the same start date as the GISS data and it has the data from 2006 to 2008, but there was a problem here as well: missing data. From 1973 to 1975 there is no data, and the same goes for 1984, 1991 and 2000. So I decided to plot them all together and this is what I got:

Argh, not a one matches the GISS “raw” data completely. There is one more USHCN dataset for me to try, but that file hasn’t been updated since June of 2009, whereas the USHCN “raw” and TOB-adjusted datasets are up to Dec 2009. The funny thing is that GHCN and USHCN are both managed by the same agency: the National Climatic Data Center (NCDC) of the National Oceanic and Atmospheric Administration (NOAA). You would think that the data in one set would match the other, but they don’t. Then I found another database maintained by the NCDC that has PDF files of the original paperwork turned in by the stations (presumably the ones in those databases). I have started to transcribe them into the spreadsheet but it is slow going (I have about 80 years’ worth of data to go). However, here is how it compares to GHCN and USHCN “raw” for the years I do have in (note that some of the years start at 1895 and work forward, and others start at 2008 and work backward):

What, you thought it would match one of the raw datasets? Get real: this is climate science, where things are murky.
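The checks described above (GHCN stopping after 2005 while GISS runs to 2008, USHCN missing 1973 to 1975, 1984, 1991 and 2000, and the datasets disagreeing year by year) can also be done mechanically. The comparison in this post was done in an Open Office spreadsheet; the Python below is only a minimal sketch of the same checks. The function names, the 1970 start of the window, the tolerance and every temperature value are assumptions made up for illustration, not station data.

```python
def missing_years(series, start, end):
    """Years in [start, end] for which the series has no value.

    series: dict mapping year -> annual mean temperature (None = missing).
    """
    return [y for y in range(start, end + 1) if series.get(y) is None]


def mismatched_years(a, b, tol=0.05):
    """Years present in both series whose values differ by more than tol."""
    common = sorted(
        y for y in set(a) & set(b) if a[y] is not None and b[y] is not None
    )
    return [y for y in common if abs(a[y] - b[y]) > tol]


# Placeholder series: only the *gaps* follow the post; the values are invented.
WINDOW = (1970, 2008)  # hypothetical window, not the real record start
USHCN_GAPS = {1973, 1974, 1975, 1984, 1991, 2000}
ushcn = {y: 10.0 for y in range(1970, 2009) if y not in USHCN_GAPS}
ghcn = {y: 10.0 for y in range(1970, 2006)}  # nothing after 2005
ghcn[2005] = 10.5  # invented disagreement, just to exercise the check

print(missing_years(ushcn, *WINDOW))  # [1973, 1974, 1975, 1984, 1991, 2000]
print(missing_years(ghcn, *WINDOW))   # [2006, 2007, 2008]
print(mismatched_years(ghcn, ushcn))  # [2005]
```

Feeding in the real annual means for each dataset in place of the placeholder values would list exactly which years each source is missing and where any two sources disagree.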


3 responses to “Will The Real Temperature Please Stand Up!”

  1. Pingback: Quote of the day! « TWAWKI

  2. E.M.Smith February 1, 2010 at 5:16 pm

    FWIW, the “GISS raw” from the GISS web site is after ‘STEP0’ of their code is run. STEP0 glues together GHCN and USHCN in a somewhat odd way. If it only has data from one of them, that gets passed through. If it has data from both, they are “adjusted” to get a smoother fit to each other…

    Details here:


    “The script then runs dump_old.f on the GHCN US-only data (to create a 1980 and newer subset for use in calculating an ‘offset’ between GHCN and USHCN), then dif.ushcn.ghcn.f (creates that odd ‘offset’ between GHCN and USHCN), cmb2.ushcn.v2.f (to ‘find the offset caused by adjustment’ in the original data and to remove it via the ‘offset’ from the prior step), then runs hohp_to_v2.f and cmb.hohenp.v2.f to add in a better version of the data for hohenpeissenberg (from input_files). It then creates the directories to_next_step and work_files, moves files into them, and finishes.

    That whole offset un-offsetting process just seems strange to me. Why not just use the non-adjusted data series OR use the already adjusted series? Why blend 1/3 of one with 1/3 of the other with 1/3 being unoffsetterized? This just looks like a great chance to introduce to the series that which is not in the original data sets. ”

    Then further down:

    “What dif.ushcn.ghcn.f does is to read in the files USHCN.v2.mean_noFIL (US with ‘3A’ removed) and ghcn_us_end (GHCN, only US, from 1980 to now) and produce the ushcn-ghcn_offset_noFIL file as output. It compares the USHCN record with the GHCN record (only back to 1980) looking for up to 10 years worth of common records.

    When it finds 10 years, or finds all there are less than 10 available since 1980, it takes the difference between “the monthly average temperatures of those years in GHCN and USHCN” and averages them together. If you only have one common year, you would compute DIFFERENCE= GHCN(year, eachmonth) – USHCN(year,eachmonth) or, for several years, it would be the average of DIFFERENCE for the first (up to 10) years counting backwards from the present to 1980. DIFFERENCE= GHCN(up.to.10.years,eachmonth) – USHCN(up.to.ten.years,eachmonth) summed over years into DIFF(eachmonth) for the 12 months of a year. This is the “offset” used later to change the actual temperature records into…, er, um, something else.

    The only reason for this seems to be to bring the two curves into alignment when you glue GHCN onto USHCN. I don’t see where this makes sense. One end or the other ought to be ‘more right’ or you are taking the risk of an error from gluing together 2 disjoint series and finding a bogus “trend”. This deserves exploration by someone more familiar with the data series. When applied, it is applied to all past records. I fail to see why an equipment change in 2001 ought to change the record from 1888, but maybe that’s just me… ”

    Oh, and as of 15 Nov 2009 they are using USHCN.v2 so the present “data” and processing may be different from this. That is, they moved the walnut shell so we have to see if this ‘pea’ is still here or not…

    Aren’t the “GIStemp Follies” fun? </sarcoff>

    Oh, and nice graphical presentation, BTW. It does very vividly indicate one of my first “Wait a minute, that looks like Bull.” moments. We’re supposed to panic over 1/10 C yet you get to choose data sets that have more than that variation between them (sometimes 10x more…) and all are ‘right’?…

  3. boballab February 1, 2010 at 6:20 pm

    @ 2
    Yeah, it was finding your site and reading up on how GISTemp worked that led me to find out that GISS doesn’t have their own “raw” dataset and that they mash together two other datasets. What really bugged me about it is that the so-called GISS “raw” back in the late 1800s and early 1900s is higher than any type of average those two datasets can make together. How can two lower temps averaged together come up to a higher temperature than either input? The only thought I can come up with, and I haven’t checked it yet, is that they are looking at the difference between the GHCN and USHCN data and adding it to one or the other to get the GISS “raw”. This makes no sense to me, but at a glance it seems to be the only mathematical method that fits.
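As a rough illustration of the point raised in the comments: a plain average of two series can never land above both inputs, but an offset of the kind E.M.Smith describes (the per-month mean of GHCN minus USHCN over up to 10 common years, counting back from the present to 1980), once applied to every past record, can. The Python sketch below is not the GIStemp code. The names `monthly_offset` and `apply_offset` are hypothetical, the choice to add the offset to the USHCN values is an assumption (the comments do not say which series receives it), and all numbers are invented.

```python
from statistics import mean


def monthly_offset(ghcn, ushcn, first_year=1980, max_years=10):
    """Per-month mean of (GHCN - USHCN) over up to max_years common years.

    ghcn, ushcn: dict mapping year -> list of 12 monthly means (None = missing).
    Years are scanned from the most recent back to first_year.
    """
    common = sorted(set(ghcn) & set(ushcn), reverse=True)
    offsets = []
    for month in range(12):
        diffs = []
        for year in common:
            if year < first_year or len(diffs) == max_years:
                break
            g, u = ghcn[year][month], ushcn[year][month]
            if g is not None and u is not None:
                diffs.append(g - u)
        # No common data for this month: assume no offset (an assumption).
        offsets.append(mean(diffs) if diffs else 0.0)
    return offsets


def apply_offset(series, offsets):
    """Add the monthly offset to every record, including the oldest ones."""
    return {
        year: [None if t is None else t + offsets[m] for m, t in enumerate(months)]
        for year, months in series.items()
    }


# Invented numbers: recent GHCN runs 0.5 above USHCN; in 1900 it runs 0.25 above.
ghcn = {y: [10.0] * 12 for y in (2000, 2001, 2002)}
ushcn = {y: [9.5] * 12 for y in (2000, 2001, 2002)}
ghcn[1900] = [9.25] * 12
ushcn[1900] = [9.0] * 12

offsets = monthly_offset(ghcn, ushcn)
adjusted = apply_offset(ushcn, offsets)

print((ghcn[1900][0] + ushcn[1900][0]) / 2)  # 9.125, between the two inputs
print(adjusted[1900][0])                     # 9.5, above both 1900 inputs
```

With these made-up values, the offset measured on 2000 to 2002 pushes the 1900 record above both source series, which is the pattern described above for the late 1800s and early 1900s. Whether that is what actually happens for State College would have to be checked against the real files.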
