Before Using Temperature Data Read The Fine Print
February 19, 2010
Who would have thought that you would have to treat temperature datasets and graphs like the credit card offers that come in the mail? You know the kind: you get a really low introductory rate, but the fine print on the back of the third page says that by accepting the terms you are locked into 3 years of 21% interest after the first 6 months. Well, you do, and here is an example that started with one of Steve Goddard’s posts over on WUWT.
Here is a reproduction of one of the graphs used in his piece:
Now where did that graph come from? Did Steve make it himself?
The answer is that it came from the US Government. Steve did not make that graph, nor any of the other graphs in his piece. They all came from the National Climatic Data Center’s (NCDC) United States Historical Climatology Network (USHCN) website.
You see, NCDC provides a nice simple interface where you can get monthly station data for any station in the USHCN v2 dataset in a convenient comma-separated values (CSV) file. This is similar to the setup that NASA GISS uses. The difference is that at GISS you select a station, see a graph first, and from there can get the data used to make it, whereas the USHCN site keeps those two functions separate.
So you don’t even need to get the data; you can just plot the graph on the USHCN site and copy it. That is where you needed to have read the fine print on your way to this point, because part of the data plotted in Figure 1 is not real. That’s right: the data from 2003 through 2008 was not taken from a thermometer at the Temple, TX station.
How do I know this?
Well, let’s walk through the steps I took to get here, and I’ll provide the links so you can follow in my footsteps.
A reader of the post on WUWT (John Slayton) had a question about that plot because the station in Temple, TX closed in 2003. He knew this because he took part in the Surface Stations project and went looking for the station at the water treatment plant. See the photos and survey here:
So John was really curious where the data in the graph came from, since there is no station there anymore. He asked in the WUWT thread, but his question probably got lost in the chatter, so he went over to Chiefio’s site and asked whether he could find out if “infilling” was occurring.
Now this is where I got involved. Chiefio has had a lot on his plate recently (see John Coleman and KUSI) and I have done a certain amount of research on the different databases. John asked about this and Chiefio answered in the context of GISTemp, but I knew that the info Steve used didn’t come from GISS, so I started looking through the USHCN site.
First I went to the USHCN site which is here:
The first thing you notice is that you can download the entire dataset from the FTP server or get individual station data from the web interface. I selected the web interface, which takes you to a page with a Google Map and a drop-down menu:
From there, select Texas from the drop-down menu and click the Map Site button. This gives you a list of stations with their USHCN station IDs. Scroll down the list until you get to Temple, Texas, ID # 418910. Click on it and a quote bubble appears back up on the map. In this bubble you get some options, such as Get Monthly or Daily data and Monthly and Daily Documentation.
From there, selecting the Monthly data takes you to a screen where you can get the CSV file or just plot the graph. What Steve did was plot the graph and copy it:
I also got the CSV file for that station, and that file has data up to 2008:
So I thought I would just read the documentation and find a simple explanation: that a new thermometer was used to extend the record, along with the ID number of the station it belongs to.
So the first stop is the station list for USHCN v2, which I happen to have on my computer (I have the whole USHCN dataset stored), and this is what it says:
418910 31.0781 -97.3183 210.0 TX TEMPLE —— —— —— +6
Now what does that mean? We go to the documentation page:
and from there we get this:
OK, according to that, no other station was added to make the record longer. That leaves a big question mark. According to the station list there is a station with ID #418910 and it has no added stations. The data in the CSV file goes up to 2008 and so does the graph. On the other hand, we know that the CO-OP station was closed in 2003, since that is what the plant manager told the survey team, and they didn’t see either an old-style Stevenson screen or an MMTS. So I went to the site where the paper forms from the CO-OP stations are available in PDF format:
So there it is listed that the CO-OP station data ended in Sept of 2003.
So where is that data coming from? If they were using another station to extend the record, you are supposed to see its ID # in the data fields of the station list.
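That check against the station list can be sketched in a few lines of Python. This is only a rough sketch: the field order (ID, latitude, longitude, elevation, state, name, three component-station slots, UTC offset) is my reading of the line quoted above, the function name and dictionary keys are made up for illustration, and the placeholder dashes are written as plain hyphens.

```python
# Characters treated as "empty slot" placeholders (hyphen, en dash, em dash).
DASHES = set("-\u2013\u2014")


def parse_station_line(line):
    """Split one station-list line into named fields.

    Assumed layout (not confirmed by the documentation quoted here):
    ID, lat, lon, elevation, state, name, comp1, comp2, comp3, UTC offset.
    The name may contain spaces, so it is taken from the middle tokens.
    """
    tokens = line.split()
    if len(tokens) < 10:
        raise ValueError("expected at least 10 whitespace-separated fields")
    station_id, lat, lon, elev, state = tokens[:5]
    comp1, comp2, comp3, utc = tokens[-4:]
    name = " ".join(tokens[5:-4])
    # Keep only slots that hold something other than dashes.
    components = [c for c in (comp1, comp2, comp3) if set(c) - DASHES]
    return {
        "id": station_id,
        "lat": float(lat),
        "lon": float(lon),
        "elevation": float(elev),
        "state": state,
        "name": name,
        "components": components,
        "utc_offset": int(utc),
    }


temple = parse_station_line(
    "418910 31.0781 -97.3183 210.0 TX TEMPLE ------ ------ ------ +6"
)
print(temple["name"], "component stations:", temple["components"])
```

An empty components list is what you would expect for a record that has not been extended with another station.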
So I thought about it for a while and remembered something about the selection screen for making the CSV file. There is an option just under the box you check to get the monthly temperature averages for that station. This option is called “The Mean Temperature Flag”. Now this doesn’t sound like much unless you read the fine print.
You see all the way back, even before you select the station you want, there is this little blurb:
Yep, you need to go through the selection process and read the documentation file closely. In that link where they explain what each field in the station list means, there is something further down, which is this:
Notice that they are talking about Flag Variables but don’t say where those flags are. Well, it turns out they come from “The Mean Temperature Flag” option, so if you check it along with the mean temp data option you get a little more info in the CSV file:
So after each data value there is now a letter flag, in the Flag 1 variable slot. From 04/2003 all the way through 2008 there is the E flag, which, going by the definitions above, tells us that:
E = value is an estimate from surrounding values; no original value is available;
So basically from April 2003 through to the end of 2008 they made up the data for that station, but do they come straight out and tell you this? No.
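To put a number on that span, here is a minimal Python sketch. It assumes you have already turned the flagged CSV into (year, month, value, flag) tuples; the CSV column layout is not reproduced here, so that step is left out, and both function names are made up for illustration.

```python
def estimated_months(rows):
    """Return sorted (year, month) pairs whose flag is 'E' (estimated)."""
    return sorted(
        (year, month)
        for year, month, _value, flag in rows
        if (flag or "").strip().upper() == "E"
    )


def span_length(first, last):
    """Number of months from first to last (year, month), inclusive."""
    (y1, m1), (y2, m2) = first, last
    return (y2 - y1) * 12 + (m2 - m1) + 1


# The span described above: E flags from 04/2003 through 12/2008.
months = span_length((2003, 4), (2008, 12))
print(months, "months, i.e.", months // 12, "years and", months % 12, "months")
```

For April 2003 through December 2008 this works out to 69 months, or 5 years and 9 months of estimated values.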
If you did what Steve Goddard did and just plotted the graph, is there any inkling from looking at it that over 5 years’ worth of plotted data is an “estimate”? Nope.
Instead you have to read the Monthly Documentation file, download a CSV file with the flag option checked, and see the E flag. Only then would you know that the plot you are making on the helpful USHCN site has fake data on it. Also notice that the Graph option is at the top of the page, while the area to get the CSV file is further down and has that innocuous “Mean Temperature Flag” option without a clue as to how important it is. Nothing is mentioned about that option in the documentation either. As a matter of fact, the information on that page looks geared more to the full dataset download from the FTP server than to the individual station web interface. So just as you have to hunt through multiple pages of that wonderful credit card offer to find the fine print on how much it will really cost you, you need to do the same with US Government temperature datasets.
What makes this even worse is that even NASA GISS doesn’t trust that “estimated” data for this record: they cut off the data they use at 2003.
So we have data estimators and infillers at NASA, and they don’t trust the data estimation and infilling at NCDC.
But you can trust that the science is settled and gamble the world’s economy on this, don’t you know!