Saturday, March 24, 2012

24 January 2011 – Maunderings on trend projection


Last week, the University of Calgary published a study predicting that the world’s glaciers and ice shelves would collapse by the year 3000.  You can read about it here.

This projection contradicted data published by the National Climatic Data Centre (NCDC) in the US, which recently released its temperature figures for the month of December.  I don’t know how to break this to you, but melting glaciers are the least of our worries.  According to present climatic trends, Minnesota will be uninhabitable in only a little over two centuries. 

Well...more uninhabitable. 

The data published by the NCDC suggest that, at some point in December 2289, the average temperature in Minnesota will reach absolute zero.  Helium will become a solid, all molecular motion will cease, and we’ll finally find out whether the laws of thermodynamics are really just a bunch of hokum made up by physicists who want to keep all the perpetual motion for themselves.

Impossible, you say?  Absolute zero can’t be achieved even in a lab, you say?  Oh, ye of little faith!  The data are indisputable.  Minneapolis is hurtling towards icy oblivion at this very moment, careering into a frigid abyss whence there can be no return.  You don’t have to trust me - just look at the trend!

Figure 1: NCDC climate data, Minnesota, December, 2002-2009; trend = -16.44F/decade (note A)

According to the last seven years of official US government temperature data, the average temperature in Minnesota in December is declining by 16.44 degrees Fahrenheit per decade.  It’s getting cold, fast.  There’s a silver lining, though; in only a little over 50 years the average December temperature will have fallen to well below -80 C, which means that all of that pesky carbon dioxide will freeze, precipitate as snow, and can be shovelled up and packed away in the reefer, never to trouble us again.

You can’t argue with figures; it’s going to happen!  These are MEASURED DATA, people!

Meanwhile, did you know that the long-term trend of a sine wave is a straight line?  No, I’m not kidding.  The equation y = sin(x), where x is expressed in radians, produces a repeating curve that gives values for y ranging between 1 and -1.  If we pick two points on that curve, we can extract a trend line.  For example: (π/2,1) and (-π/2,-1) gives a trend line where Δy / Δx = 2/π.  The equation of that line is y = 2x / π.  This is clearly not the same as the equation for the sine curve; it’s a straight line trending upwards to the right of the chart, forever, whereas the sine wave cycles between a y value of 1 and -1, never exceeding either.  To drive the idea home, try it with different points.  Selecting the points (π,0) and (-π,0) gives you a line with a slope of 0.  Selecting the points (-3π/2,1) and (-π/2,-1) gives you a line with a slope of -2/π.

See what I’m getting at?  By carefully selecting the end-points for your trend analysis, you can derive, from a simple sine curve, a linear trend proceeding infinitely upwards at a slope of 2/π; a linear trend proceeding infinitely downwards at a slope of -2/π; or a linear trend proceeding infinitely onwards at a slope of 0.  And none of them bear any genuine relation to the curve from which they were derived.  In short, when you project a linear trend from a cyclical curve, the direction of the trend depends on the end-points you select.  Select your end-points carefully, and you can produce just about any trend you like.
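For anyone who wants to check the arithmetic, here is a minimal sketch of the same three end-point choices. The function name endpoint_slope and the labels in pairs are illustrative only; nothing here comes from the NCDC tool.

```python
import math

def endpoint_slope(x1, x2, f=math.sin):
    """Slope of the straight line joining (x1, f(x1)) and (x2, f(x2))."""
    return (f(x2) - f(x1)) / (x2 - x1)

# The three pairs of end-points discussed above, all taken from y = sin(x).
pairs = {
    "rising": (-math.pi / 2, math.pi / 2),
    "flat": (-math.pi, math.pi),
    "falling": (-3 * math.pi / 2, -math.pi / 2),
}

slopes = {name: endpoint_slope(a, b) for name, (a, b) in pairs.items()}

for name, slope in slopes.items():
    print(f"{name:8s} slope = {slope:+.4f}")
```

Same curve, three different "trends", and the only thing that changed was where the ruler was laid down.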

Okay, back to Minnesota, where by the end of the century they’ll be dodging puddles of liquid oxygen in the parking lot.  As will be obvious from the foregoing examples, deriving a linear trend from a curve and projecting it indefinitely into the future is clearly an exercise that is open to manipulation based on how cleverly you select your endpoints.  In asking the NCDC plot generator to give me the above plot, I selected a year with an unusually warm December (2002) for the start of the calculation, and a year with an unusually cold December (2009) for the end.  Doing that gave me a linear temperature trend of -16.44F per decade - which, extended into the future, means that in a century, the average December temperature in Minnesota will be about -170F.

Bundle up, right?  It won’t help.  If we change the end-points to 2005 and 2007, the NCDC gives us a trend of -23.5F/decade.  Absolute zero in only about two centuries!  Minneapolis is doomed!
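The doomsday arithmetic is easy to reproduce. The sketch below assumes a starting temperature of 10F, which is a made-up round number rather than a value read from the NCDC plots, so treat the output as illustrative; years_until is a hypothetical helper, not part of any NCDC tool.

```python
ABSOLUTE_ZERO_F = -459.67  # absolute zero in degrees Fahrenheit

def years_until(target_f, start_f, trend_f_per_decade):
    """Years for a straight-line trend to carry start_f to target_f."""
    if trend_f_per_decade == 0:
        raise ValueError("a flat trend never reaches the target")
    years = (target_f - start_f) / (trend_f_per_decade / 10.0)
    if years < 0:
        raise ValueError("the trend moves away from the target")
    return years

# ASSUMPTION: 10F is an invented starting temperature, not an NCDC value.
START_F = 10.0

# The two trends quoted in the text, in degrees F per decade.
for trend in (-16.44, -23.5):
    years = years_until(ABSOLUTE_ZERO_F, START_F, trend)
    print(f"{trend:+.2f}F/decade -> absolute zero in about {years:.0f} years")
```

Change START_F and the date of doom shifts by a few decades either way, but the conclusion stays just as absurd.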

But it’s not doomed, because three years do not a trend make.  Nor do twenty, especially when you can pick which twenty years you look at to give you the result you want.  Let’s go back to the Minnesota data.  Take a look at two different twenty-year periods in the data:

Figure 2: NCDC climate data, Minnesota, December, 1931-1951; trend = -1.82F/decade (note A)

Figure 3: NCDC climate data, Minnesota, December, 1983-2003; trend = +5.46F/decade (note A)

Which of these trends is accurate?  Which one should we trust to give us an idea of where temperature is going over the long term?

The answer is ‘neither’.  These are all examples of a failure of analysis known as the ‘end-point fallacy’, which is the argument that a short-term trend extracted from a long-term series is necessarily representative of the long-term series, and may be substituted for it.  In logic this is known as the fallacy of composition, i.e., inferring the shape of a composite entity from the shape of one of its constituent parts - akin to inferring, from the shape and structure of a tire, that an entire car must necessarily be circular and made of fibreglass belts and vulcanized rubber. In reality, though, you can only determine the shape and structure of the car by observing the whole car - just as you can only determine the shape and structure of a sine wave by observing the whole wave (at least for a sufficient number of cycles to determine its equation).

This is one of the key weaknesses of linear trend projection.  While it is one of the least unreliable forecasting tools available to scientists, its utility is highly conditional, and it is weakened by imperfect understanding of the nature of the trends we are attempting to project.  How do we get around it?  Well, if we MUST extract a linear trend from a cyclical phenomenon, we can minimize our failures by maximizing the observational baseline for the trend.  So in the case of December temperatures in Minnesota, we have to look at all the data available.  There’s quite a lot, actually.  More than a hundred years’ worth.

Figure 4: NCDC climate data, Minnesota, December, 1895-2010; trend = +0.1F/decade (note A)

If we expand the analysis of the curve to the full 115 years of data that are available, we find that the average December temperature in Minnesota has ranged from a low of 0F in the early 1980s, to a high of 25F in the 1940s.  We also find that the trend line is an increase of +0.1F/decade (or +0.056C/decade).  December temperatures in Minnesota warmed at a rate of half a degree Celsius per century (which is less than 3/4 of the 0.7C/century increase in average global temperature posited by the IPCC).  The warming trend in Minnesota December temperatures, furthermore, is 1/250th or 0.4% of the observed variability.  That’s a lot more reasonable than predictions of plunging or skyrocketing temperatures based on linear projections derived from selection of more proximate end-points.
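The effect of the baseline length can be sketched with invented data. The series below is NOT the NCDC record: it is a hypothetical 60-year cycle of amplitude 10F laid over a built-in warming of 0.1F/decade, and the names ols_slope, synthetic_temp and window_trend are mine. The fit is an ordinary least-squares line, which may or may not be what the NCDC plot generator uses.

```python
import math

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# ASSUMPTIONS: every constant below is invented for illustration.
TRUE_TREND = 0.1   # built-in warming, degrees F per decade
AMPLITUDE = 10.0   # size of the cycle, degrees F
PERIOD = 60.0      # length of the cycle, years
YEARS = list(range(1895, 2011))

def synthetic_temp(year):
    """Hypothetical December temperature: small trend plus a sine cycle."""
    t = year - 1895
    cycle = AMPLITUDE * math.sin(2 * math.pi * t / PERIOD)
    return 12.0 + (TRUE_TREND / 10.0) * t + cycle

def window_trend(first, last):
    """Fitted trend in degrees F per decade over first..last inclusive."""
    xs = [y for y in YEARS if first <= y <= last]
    ys = [synthetic_temp(y) for y in xs]
    return 10.0 * ols_slope(xs, ys)

rising = window_trend(1945, 1965)   # twenty years on the upswing
falling = window_trend(1975, 1995)  # twenty years on the downswing
full = window_trend(1895, 2010)     # the whole record

for label, trend in (("1945-1965", rising), ("1975-1995", falling), ("1895-2010", full)):
    print(f"{label}: {trend:+.2f}F/decade (built-in trend {TRUE_TREND:+.2f})")
```

In this toy series the two twenty-year windows return steep trends of opposite sign, while the full record lands far closer to the built-in figure. It still does not recover it exactly, because the cycle never quite averages out of a straight-line fit.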

The real problem with linear trend analysis, of course, is that the temperature trend isn’t really linear at all - it’s cyclical.  If we projected even that slight half-degree-per-century warming 1000 years into the future, we’d logically conclude that Minnesota Decembers would be 5C warmer than average in the year 3011.  But how valid would such a conclusion be?  And more to the point, how useful would it be?  Presumably somebody will still be living in Minnesota in the year 3011 (it’s at least a little more likely if the winters are 5 degrees warmer than they are at present); but if we consider the range of unforeseeable events that might “shock” us, and the fact that the trend we are linearly projecting is cyclical and demonstrably non-linear, then the unreliability and inutility of long-term linear projection become depressingly clear.  The longer the predictive timeline, the greater the statistical likelihood of error.
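For the record, here is the unit conversion and the thousand-year projection as a short sketch. The function names are illustrative. The only inputs are the +0.1F/decade trend and the 1000-year horizon quoted above; the unrounded result comes out a little above the 5C obtained from the rounded half-degree-per-century figure.

```python
def f_per_decade_to_c_per_century(trend_f_per_decade):
    """Convert a trend in F/decade to C/century.

    Temperature differences scale by 5/9; the 32-degree offset applies
    only to absolute readings, not to changes.
    """
    return trend_f_per_decade * (5.0 / 9.0) * 10.0

def projected_change_c(trend_f_per_decade, years):
    """Total change in degrees C if the linear trend held for `years`."""
    return f_per_decade_to_c_per_century(trend_f_per_decade) * years / 100.0

per_century = f_per_decade_to_c_per_century(0.1)
by_3011 = projected_change_c(0.1, 1000)

print(f"trend: {per_century:.2f}C/century")
print(f"naive change over 1000 years: {by_3011:.1f}C")
```

The arithmetic is trivial, which is rather the point: nothing in it knows that the underlying series is cyclical.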

When we attempt to apply this method to social scientific analysis, we are also crippled by the fact that linear trend projection makes no allowance for unforeseeable, watershed events.  The classical example is London’s great manure crisis.  In 1900, the city of London had 11,000 horse-drawn cabs and several thousand buses, each of which required twelve horses per day.  Add to these the horses necessary to draw goods wagons, carts and private conveyances, and the number becomes quite significant.  New York City, in 1900, was in similar straits, with more than 100,000 horses, producing an impressive 2,500,000 pounds of manure per day, all of which had to be collected and removed.  This “fertilizer crisis” was so severe in large cities that one author writing in 1894 for The Times of London predicted that “in 50 years every street in London would be buried under 9 feet of manure.” (note B)

This did not happen, of course; the internal combustion engine, still a relative novelty in 1894, eventually replaced the horse – and far more rapidly in the cities than in the countryside.  A half-century after 1894, the streets of London were indeed buried – but they were buried in rubble rather than manure, the result of five years of mechanized warfare featuring high-altitude piston-engined bombers, pulse-jet-driven flying bombs, and liquid-fuelled ballistic rockets – weapons that would have been deemed the height of fantasy by a writer sitting in fin-de-siècle Britain, solemnly predicting that the Imperial capital was doomed to be smothered in equine feces.  The author of that piece had made the fatal error of deriving a linear trend from a non-linear phenomenon, and projecting it fifty years into the future.  He also made insufficient allowance for the “technology shock” of the internal combustion engine - which of course he could never have reasonably been expected to allow for in the first place.  “Shocks” are by definition ex-post-facto phenomena.  They only “shock” us because we failed to foresee them; if they can be foreseen, then they cannot shock us.

And the solution to that part of the problem is parsimony.  If we absolutely must derive linear trends from non-linear phenomena and attempt to project them into the future, then the only way to minimize the certainty of error is to use all of the data available to us to better understand the nature of the phenomenon we are studying; to acknowledge all of the possible sources of error, and bubble-wrap our analysis in caveats; and to keep the period of our artificial linear projection as short as we possibly can.

The importance of parsimony in trend projection cannot be overstated.  Exactly 100 years ago, the pre-eminent technologist of the era, Thomas Edison, was asked to envision the technology of the year 2011, a century hence.  His answers were published in the June 23, 1911 edition of the Miami Metropolis.  Among Edison’s predictions were the following:
· Electric trains driven by hydroelectric power (correct);
· The demise of the steam engine (“as remote an antiquity as the lumbering coach of Tudor days” - incorrect, as coal- and oil-burning steam turbines produce most of the electricity on the planet, just as they did in Edison’s day);
· Travelers “will fly through the air, swifter than any swallow, at a speed of two hundred miles an hour” (right on flight, but off by a factor of four on velocity);
· Houses will be built and furnished entirely out of steel (“The baby of the twenty-first century will be rocked in a steel cradle; his father will sit in a steel chair at a steel dining table, and his mother’s boudoir will be sumptuously equipped with steel furnishings” - incorrect; while steel is more widely used these days, plastic is the ubiquitous material, while houses are still built predominantly out of concrete, bricks and wood);
· Future books will be printed entirely on “leaves of nickel, so light to hold that the reader can enjoy a small library in a single volume. A book two inches thick will contain forty thousand pages, the equivalent of a hundred volumes; six inches in aggregate thickness, it would suffice for all the contents of the Encyclopedia Britannica. And each volume would weigh less than a pound.” (Incorrect - although this would be really cool);
· We’ll all be riding in golden taxis due to the demise of gold as a monetary standard.  Edison was right about the last, but for the wrong reason; he argued that we would soon be able to transmute iron into gold (“We are already on the verge of discovering the secret of transmuting metals, which are all substantially the same in matter, though combined in different proportions...Before long it will be an easy matter to convert a truck load of iron bars into as many bars of virgin gold.” - very incorrect, to say the least). (note D)

It’s crucial that we recognize that in making these predictions, Edison was not predicting anything revolutionary; he was working from known technological discoveries and ideas that existed in his time, and projecting them a century forward.  Why shouldn’t he predict aircraft flying at 200 mph in another century?  Airplanes were already flying at close to 100 mph when he made his prediction.  Doubling that after a century’s worth of work would not be a stretch.  As for being right on electric trains and hydroelectric power, this isn’t surprising given that Edison had himself opened one of the first hydroelectric generating stations thirty years earlier, in 1882.  None of this was truly earth-shattering.
What’s more important is what he didn’t predict.  He predicted none of the things that really changed the modern world: antibiotics, jet engines, television, nuclear power, nuclear weapons, genetic engineering, synthetic polymers, computers, space travel...the list is endless.  It’s not Edison’s fault; the man wore starched collars, had never heard of a “tank”, and might have taken over-the-counter radium pills.  Even in places where there were clues - the “wireless”, for example, was well-known in his time, and the base technologies necessary for “television” would be in place before his death - Edison didn’t “blue-sky” anything.  Everything he predicted was evolutionary, not revolutionary.  And he still got most of it wrong.
Let’s be fair, though.  How could Edison have rationally predicted space travel anyway? Even if he thought it might one day be possible, there was nothing to base his prediction on.  It would be another 15 years before Goddard launched the first liquid-fuelled rocket.  Having never seen a liquid-fuelled rocket, how could Edison have imagined that only fifty years after Goddard’s launch - which Edison lived to see - we would be using colossal versions of that rocket to dispatch robots on journeys that would take them out of the Solar System (and to threaten each other with weapons capable of unimaginable devastation)?  Even if he’d dreamed of space flight, like his contemporary H.G. Wells had done, Edison, as a serious and practical scientist, couldn’t have made such predictions without sounding like a fantasist at best, and a lunatic at worst.  Voyager was simply not predictable from the knowledge base of his era.  None of the reasonable “trend lines” of 1911 pointed at space travel.  There was no rational way to get to “here” - the present that we know - from “there”, Edison’s day.  Meanwhile, those trend lines that did make sense to him pointed at a lot of things - like nickel books, steel houses, and gold taxicabs - that were never to be.  Edison’s only accurate predictions in that article - electric trains, hydro power, and 200 mph aircraft - were simply marginal refinements of things that were already being done.
Minnesota’s impending icy doom and Edison’s gold taxicabs illustrate the problem that analysts face whenever we are asked to look to the future.  The best we can do is work from a comprehensive knowledge of the past, and make reasonable, parsimonious, short-term predictions based on every last scrap of data we can muster.  The alternative is to eschew study of historical trends, selecting our endpoints creatively to hype the story we’re trying to sell, and projecting the derived linear trends decades into the future, secure in the knowledge that we’ll all be retired or dead before our predictions are disproven.  Even if we take the prudent, scientific course, like Edison did, we’re going to be wrong most of the time.  The people who, in 1911, invested in hydro power, electric trains and aircraft presumably did fairly well.  Those who sold all of their soon-to-be-worthless gold and put their money into the nickel book-printing and all-steel-housing industries, though...
Bottom line, whenever we pull the lever on the What-If? machine, we’re taking a risk.  We can apply science to mitigate that risk, or we can allow our imaginations free rein and have fun with it.  Like that chap in London who in 1894 projected a near-term trend into the distant future and found that it led inevitably to nine-foot-deep piles of manure, I guess it all comes down to what we feel like shovelling.

(A)    []

(B)    Stephen Davies, “The Great Horse Manure Crisis of 1894”, The Freeman, September 2004, 33.