One More Reason to Distrust Global Warming Predictions

"Garbage in, garbage out" has become a cautionary maxim of the computer age, reminding us that bad data corrupts the output of computer software and many other artifacts of modern technology. What then are we to make of global warming scientists who present us with temperature charts purporting to display changes in the global mean temperature for the last century-plus? Who was measuring global mean temperatures in the 1880s?

Political observers are familiar with "margin of error" in opinion polls. Polls seek to measure the political prospects of candidates competing for elected office. The "true value" of the candidate's political viability is established when the votes are counted. 


Similarly, the mathematics of physical measurement requires knowledge of the potential sources of error in order to place bounds on the likely true value. Estimated errors are expressed as "error bars" in plots of empirical data. The true value could be any value within the margin of error.

With this background, I was astonished to see the assurance with which climatologists writing about global warming report Global Mean Temperature over Land & Ocean as far back as 1880 -- as shown in the US Government's "official" NOAA Chart reproduced below.

It is noteworthy that most reproductions of this chart in the popular press omit the error bars. Who could possibly have been measuring Global Mean Temperature (GMT) so accurately (±0.14°C) in 1880? For context, remember that as recently as 1871, Mr. Stanley was searching for Dr. Livingstone, who was totally lost while exploring Africa for the source of the Nile. Put another way, as seen from the error bars, why has science's ability to measure global mean temperature improved in accuracy only by a factor of two (to ±0.07°C) in the last 125 years?

[Chart 1: Global Mean Temperature. Source: NOAA]

To answer these questions, I read the scientific papers published by the authors of this chart, which are available on the same NOAA site as the chart itself. This reading provided me with new revelations about "climate models". Echoing the title of the chart's ordinate, I began to suspect this climate record is "anomalous" in more ways than one.

Before I report on the scientific papers, however, I would like to remark on physical measurements, particularly temperature measurement. To minimize errors, it is best to take many measurements of the same phenomenon and then average the result. Statistics tells us that in the absence of systematic errors the average of a large number of measurements is more likely to reflect the true value than just one measurement. This is because the errors in each measurement are assumed to be random in magnitude and sign (plus or minus) and so tend to cancel each other out when summed for the purpose of averaging. This is true in opinion polling as well: a larger representative sample is more likely to yield an accurate poll than one with fewer respondents. So, provided there is no systematic error, averaging many measurements generally yields a more accurate result.
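The averaging argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the errors are taken to be Gaussian with zero mean, and the function names, the 15.0 °C "true" temperature and the 0.5 °C noise level are all hypothetical values chosen for the example, not anything taken from the NOAA record.

```python
import random
import statistics


def mean_of_readings(true_value, noise_sd, n, rng):
    """Average n simulated readings that carry only random, zero-mean error."""
    readings = [true_value + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return statistics.fmean(readings)


def spread_of_means(true_value, noise_sd, n, trials=2000, seed=1880):
    """Repeat the n-reading experiment many times and report how widely
    the resulting averages scatter (their standard deviation)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = [mean_of_readings(true_value, noise_sd, n, rng) for _ in range(trials)]
    return statistics.pstdev(means)


if __name__ == "__main__":
    # Hypothetical thermometer: true temperature 15.0 °C, random error 0.5 °C.
    for n in (1, 10, 100):
        print(n, "readings -> scatter of the average:",
              round(spread_of_means(15.0, 0.5, n), 3))
```

Under these assumptions the scatter of the average shrinks roughly in proportion to one over the square root of the number of readings, which is why a hundred readings do about ten times better than one.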

The caveat noted above concerns systematic error. Returning to our polling analogy, a systematic error might be the over-sampling of Democrats. No matter how large the sample, if Democrats are overrepresented compared to their true proportion in the voting population, then the poll will erroneously skew toward the Democratic candidate and not reflect the true value. Similarly, instruments such as thermometers can display systematic error. Systematic error can be uncovered and possibly corrected by a process called calibration.
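To make the contrast with random error concrete, here is a minimal sketch, again assuming Gaussian random error. The function name and the numbers (a thermometer that reads 0.2 °C high, with 0.5 °C of random scatter) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def average_reading(true_value, bias, noise_sd, n, seed=0):
    """Average n simulated readings from an instrument that has both a
    constant systematic offset (bias) and random, zero-mean noise."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    readings = [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return statistics.fmean(readings)


if __name__ == "__main__":
    true_temp = 15.0  # hypothetical true temperature, °C
    for n in (10, 1000, 100000):
        error = average_reading(true_temp, bias=0.2, noise_sd=0.5, n=n) - true_temp
        print(n, "readings -> error of the average:", round(error, 3))
```

However many readings are averaged, the error of the average settles near the 0.2 °C offset rather than near zero: the random part cancels, the systematic part does not.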

I offer these further remarks to indicate what is involved in taking temperature data with any hope of attaining 0.1 °C accuracy, particularly with thermometers available in 1880.

The two types of systematic error commonly found in measurement instruments are "zero-point drift" and "scale deviation". Synchronizing clocks or watches is a ready example of correcting zero-point drift. The true value of time is ultimately set by atomic clocks at the Naval Observatory, which is the time displayed on GPS-equipped cellular phones. A clock in your home may need to be reset after a temporary power outage. Until it is reset or "calibrated", the time it displays is incorrect. When the clock is set to your cell-phone time, you are shifting its entire scale (the dial's hands or the numeric display) so that its value coincides with the true value. In this manner, whenever the time is "00:00 hours" (i.e. 12:00 midnight) the reset clock and the cell-phone true time display the same "zero point". A watch running fast or slow illustrates scale deviation; the seconds it ticks off are not exactly one second in length. Even if its "zero point" is reset, it won't keep accurate time. The error may be difficult to see at first, but after a while it will accumulate and the watch will display a time that is ahead of or behind true time. Correcting this error requires the help of a watchmaker or else replacing the watch.
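The difference between the two kinds of error can be put in a small formula: the displayed error is a constant offset plus a rate error multiplied by the elapsed time. The sketch below assumes exactly that simple linear picture; the function and parameter names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def displayed_error_seconds(elapsed_seconds, zero_offset=0.0, rate_error=0.0):
    """Seconds by which a clock is ahead of (+) or behind (-) true time.

    zero_offset -- constant error in seconds (zero-point drift)
    rate_error  -- fractional error in the length of each tick (scale
                   deviation); 1 / 86_400 means the clock gains one
                   second per day
    """
    return zero_offset + rate_error * elapsed_seconds


if __name__ == "__main__":
    month = 30 * SECONDS_PER_DAY
    # A clock set two minutes fast stays two minutes fast.
    print(displayed_error_seconds(month, zero_offset=120.0))
    # A watch gaining one second per day keeps drifting further off.
    print(displayed_error_seconds(month, rate_error=1 / SECONDS_PER_DAY))
```

A zero-point error stays the same size forever and is cured by resetting; a scale error starts at nothing and grows without limit, which is why resetting alone does not cure it.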

Similar calibration problems arise in temperature measurement. Thermometers can display zero-point drift and scale deviation. The zero point of the Celsius scale is the freezing point of water (32 °F). A flask of ice-water is exactly 0 °C; this is because any heat added to the ice-water will not change its temperature but rather will melt some ice. Conversely, any heat extracted from the ice-water will not change its temperature but will rather freeze some water. So long as there is both ice and water present, an ice-water bath is a reliable way to set the zero point of a thermometer. When placed in the bath, the thermometer should read 0 °C; if instead it reads, for example, 0.2 °C then it is out of calibration and every data point collected with that thermometer will be off by 0.2 °C. Such data cannot be trusted. As with checking a watch at two different times to see if it is running fast or slow, scale deviation can be detected by measuring two temperature points to make sure they are both correct. Boiling water at sea level[1] is exactly 100 °C (212 °F). So long as both water and steam bubbles are present, the boiling water does not change in temperature and represents another convenient reference for calibrating thermometers. Only when a thermometer has been calibrated against these two reference points (or other "known" references, perhaps established by another calibrated thermometer) can it be said to measure the true temperature without systematic error.
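The two-point procedure just described amounts to fitting a straight line through the ice-bath and boiling-water readings. The sketch below assumes the thermometer's error is linear between those two points; the function name and the example readings (0.2 °C in ice-water, 100.7 °C in boiling water) are hypothetical.

```python
def make_two_point_calibration(ice_reading, boil_reading,
                               ice_true=0.0, boil_true=100.0):
    """Return a function that corrects raw readings, given what the
    thermometer showed in an ice-water bath and in boiling water at sea level.

    Assumes the instrument's error is linear between the two reference points.
    """
    if boil_reading == ice_reading:
        raise ValueError("the two reference readings must differ")
    # Scale deviation: true degrees per displayed degree.
    scale = (boil_true - ice_true) / (boil_reading - ice_reading)

    def correct(raw_reading):
        # Remove the zero-point error first, then rescale.
        return ice_true + (raw_reading - ice_reading) * scale

    return correct


if __name__ == "__main__":
    # Hypothetical thermometer: reads 0.2 °C in ice-water, 100.7 °C when boiling.
    correct = make_two_point_calibration(ice_reading=0.2, boil_reading=100.7)
    for raw in (0.2, 20.3, 100.7):
        print(raw, "->", round(correct(raw), 2))
```

A single reference point could only remove the 0.2 °C zero-point error; the second point is what exposes and corrects the scale deviation.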

Let's consider for a moment what it took to measure the annual mean temperature in just one city using methods available in 1880 or even 1930. Temperature would have been taken by mercury thermometer, read by eye. Clearly taking one reading per year is inadequate. Should it be done monthly? Weekly? It would seem most appropriate to carefully take a measurement at least once a day, ideally at the same time. Is that enough data? Should the measurement always be taken in the sun? In the shade? At night? Are the method and schedule adopted consistent with those employed by other meteorologists conducting similar measurements elsewhere? To ensure accuracy, it would be desirable to calibrate the thermometer at least once a year, read[2] it with precision, record the measurements legibly in a paper notebook, and report the methods and results from time to time in scientific correspondence and journals. These actions, day by day, year by year, would require considerable skill, discipline and dedication from scientists of integrity. Inadvertent deviations from the proper procedure could lead to systematic errors.

Fortunately for mankind there were pioneering weathermen in the 1800s, almost exclusively in North America and Western Europe, who were dedicated, skilled and successful scientists - the largely unsung heroes who founded meteorology. And the foundation they laid in the 19th century for the science of weather forecasting has paid off with vast benefits to humanity. Nonetheless, were there enough reliable data points collected in the 1880s to calculate Global Mean Temperature reliably?

In contrast, today NOAA and NASA have access to all sorts of equipment, such as automated weather stations equipped with precision thermocouples (i.e. electronic thermometers) and automated data-loggers, aircraft fleets, survey ships, and satellites fitted with infrared detectors that can continuously survey the earth's surface temperatures from space. Yet with all this highly calibrated instrument technology generating enormous sets of precise data, our ability to calculate global mean temperature has improved in accuracy only by a factor of two over the past 125 years. How to explain this incongruity?

By definition, Global Mean Temperature is a calculated value, derived from representative samples of reliable data collected in a manner that allows statistical averaging so that daily and seasonal fluctuations in temperature can be, so to speak, "averaged out".  Furthermore, sufficient geographic distribution of data sets should also be available to construct a meaningful "average" for all latitudes and longitudes of the globe.  Only after these statistical processes are completed can science "take the temperature" of the planet to see whether it "has a fever".

Figure 2 below is reproduced from one of the source papers for Chart 1 above, "An Overview of the Global Historical Climatology Network Temperature Database" by T.C. Peterson and R.S. Vose, Bulletin of the American Meteorological Society Vol. 78 No. 12, Dec 1997, pp 2837-49. These gentlemen and their colleagues carefully collected, culled and catalogued historical temperature data to produce a reliable record. As one would expect, reliable data sets from the late 19th century are sparse.



How then to calculate GMT from such sparse data, which happens to be concentrated in North America and Europe? Well, the solution is not to calculate GMT from contemporary 1880 data but to reconstruct GMT by identifying correlations among modern geographic temperature data distributions and then projecting these same correlations backward onto the support of the relatively sparse empirical data available in earlier times. See for example "A Global Merged Land-Air-Sea Surface Temperature Reconstruction Based on Historical Observations (1880-1997)" by T.M. Smith & R.W. Reynolds, Journal of Climate, 15 June 2005, Vol. 18, pp 2021-36. In this regard, Chart 1 is somewhat misleading, for in the absence of a clarifying legend, most scientists would assume by convention that the data reported for 1880 had been collected in 1880.

Thus, "climate change" alarmism is based on two mathematical models: 1) a statistical model that "reconstructs" historical data using modern correlations projected backward for 100 years onto the thin support of sparse empirical data sets available then; and 2) a "climate model" that uses this very same reconstructed historical data as key input data to project climate forward for the next 100 years. 

My credentials as a statistician are non-existent and so I cannot critique their mathematics, but I do have severe doubts about the claims of accuracy for these reconstructions that lead to the tight error bars in Chart 1. I do know that the climate models' double extrapolation has the potential to magnify any errors in the sparse underlying data, in the same way that a time-keeping error can accumulate in a watch that runs fast or slow. A systematic error was recently discovered in modern North American climate data.

It will be impossible to discover such errors in the data collected long ago with more primitive instrumentation, even though such errors might well be there.  And given that North American data is the central pillar for "reconstructing" the GMT climate record, James Hansen's reluctance to correct the recent US temperature record, and dismay about having to do so, is a display of incompetence.

Our focus should be on the error bars. Please note that the blue trend line in Chart 1 is drawn for convenience and does not imply that this is the true or even likely historical temperature profile. Statistics tells us that any line drawn through the error bars is just as likely to be the true value as any other. So if the error bars in the early decades are just doubled to ±0.28 °C, it would be possible to draw entirely valid GMT profiles that are flatter, showing less temperature rise, and thus not nearly as "alarming" as the one shown in Chart 1.

Writing in the Wall Street Journal, Robert Lee Hotz cautioned us that alarming sloppiness in statistical studies is all too common. He cited a study by Tufts University professor John Ioannidis:
"The hotter the field of research the more likely its published findings should be viewed skeptically, he determined."
No one was even attempting to measure GMT during the 1880s.  The GMT climate record is a statistical re-construction primarily based on modern data, which itself has been shown recently to be subject to systematic error in need of correction.  Although I am not a statistician, I have profound reservations about the purported accuracy of this reconstructed data.

There is also a more basic reason for my skepticism concerning climate models: 
Although climate scientists show no reluctance to assign high accuracy to their backward projections of global climate data, and although global-warming theorists boldly predict climate disasters 100 years from now, they all seem quite reticent about predicting climate conditions one or two years hence. 


[1] Bubbles form when they generate internal pressure greater than the pressure of air above the surface of the boiling water; atmospheric pressure varies with altitude, so cooks know to make "high altitude" adjustments to their recipes.

[2] Anyone who has used a 110°C mercury thermometer typically found in chemistry laboratories knows that it is very difficult to read the scale with accuracy better than the degree marks themselves, and indeed the "Error in Measurements" webpage listed above records the precision of such thermometers as, at best, ±0.2°C.