Parsing Is Not Understanding

The substantial advances in natural language processing made by IBM’s “Watson” supercomputer, while genuinely impressive, have unfortunately given rise to exaggerated claims of the sort that is all too common in computer science. Our tendency to anthropomorphize our creations has led many to uncritically claim that Watson has “intelligence” or is able to understand “meaning” in words. Even less soberly, some are envisioning a day when such “artificial intelligences” will make humans obsolete. These silly claims are grounded in a philosophical sloppiness that fails to distinguish between concepts and their representations, between signal processing and subjective awareness, between parsing and understanding. I have already addressed some of these errors in the eighth chapter of Vitalism and Psychology.

While a little fanciful anthropomorphizing of a computer may seem harmless now, there is a grave danger that we will be led into disastrous social and ethical decisions when computers are able to mimic intelligent behavior more convincingly. As an extreme example, if we were to take seriously the claim that a computer has rendered humans obsolete, we would foolishly replace ourselves with a bunch of unaware machines sending signals to each other, yet having no interior psychological life. Alternatively, we might decide that machine “intelligences” should enjoy rights formerly reserved only to humans.

These absurdities can be avoided if we confront the reality that there is nothing fundamentally different about the behavior of a supercomputer like Watson as compared with its simpler predecessors. All these machines do is process signals algorithmically. They have no intensional understanding of meaning. What we call a computer’s “understanding” or “intelligence” is really just the way it treats a given signal object, and this is strictly determined by its hard wiring and its program (though the latter may include random variables). It is completely unnecessary for the computer to know what it is doing. For example, Watson may distinguish which of the several definitions of the word “bat” is intended by context, but this distinction does not involve actually knowing or seeing a baseball bat or a flying mammal. It is a strictly functionalistic analysis of language, selecting one of several possible senses on the basis of probabilities computed from syntactic context.
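Watson’s actual pipeline is proprietary and vastly more elaborate, but the flavor of such probabilistic sense selection can be sketched in a few lines. Everything below (the sense inventory, the co-occurrence counts) is an invented toy for illustration, not Watson’s method:

```python
# Toy word-sense disambiguation: pick the most probable sense of "bat"
# from co-occurrence counts. All counts are invented for illustration.
import math

SENSE_COUNTS = {
    "baseball_bat":  {"pitcher": 40, "swing": 35, "hit": 50, "cave": 1, "wings": 1},
    "flying_mammal": {"cave": 45, "wings": 40, "nocturnal": 30, "swing": 2, "hit": 3},
}
VOCAB = {w for counts in SENSE_COUNTS.values() for w in counts}

def score(sense, context_words):
    """Log-probability of the context under a sense (add-one smoothing)."""
    counts = SENSE_COUNTS[sense]
    total = sum(counts.values())
    return sum(math.log((counts.get(w, 0) + 1) / (total + len(VOCAB)))
               for w in context_words)

def disambiguate(context_words):
    """Return the highest-scoring sense: pure signal processing, with no
    awareness of baseballs or mammals."""
    return max(SENSE_COUNTS, key=lambda s: score(s, context_words))

print(disambiguate(["the", "pitcher", "saw", "him", "swing", "the", "bat"]))
# -> baseball_bat
```

The point is not the particular statistics, but that the selection is a mechanical computation over tokens; nothing in it refers to, or requires, the concepts the tokens stand for.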

Years ago, I wrote a C program that solves quartic polynomial equations, which was simple enough to run on an IBM 386. This program did not give the computer the power to understand higher mathematics. I simply reduced an intelligible process to an algorithm that a machine could execute without understanding anything about anything. The computer did not know it was doing math any more than a chess program knows it is playing chess. The same is true with respect to Watson and language. It has not the slightest grasp of conceptual meaning. The impressive achievement in its programming is reducing the vast possibilities of natural language parsing to an executable algorithm that achieves a high (though not perfect) degree of accuracy in its results.
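The original program is long gone, but a minimal sketch of the same kind of blind procedure (here a simple Durand-Kerner iteration, not the original method) shows how little “understanding” is required:

```python
# Sketch of mechanically solving x^4 + a*x^3 + b*x^2 + c*x + d = 0
# by Durand-Kerner iteration. The machine just shuffles complex numbers;
# it has no notion that it is "doing algebra."
from functools import reduce
from operator import mul

def solve_quartic(a, b, c, d, iterations=200):
    p = lambda x: x**4 + a*x**3 + b*x**2 + c*x + d
    roots = [complex(0.4, 0.9)**k for k in range(1, 5)]  # distinct starting guesses
    for _ in range(iterations):
        new_roots = []
        for i, r in enumerate(roots):
            denom = reduce(mul, (r - s for j, s in enumerate(roots) if j != i), 1)
            new_roots.append(r - p(r) / denom)
        roots = new_roots
    return roots

# Example: x^4 - 10x^2 + 9 = 0 has roots -3, -1, 1, 3.
print(sorted(round(r.real, 6) for r in solve_quartic(0, -10, 0, 9)))
```

The mechanical character is the same whether the algorithm is a closed-form formula or an iteration; in neither case does the machine know what a root is.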

It is certainly not true that Watson understands language the same way humans do, much as Deep Blue did not play chess as humans do. Quite simply, humans do not have the computing ability to explore millions of possibilities in a few seconds, so that is certainly not how we identify the meanings of words in speech in real time. We are able to intuit or directly understand the meanings of words, so we do not have to do much deep analysis to figure out how to interpret ordinary conversation. The great power of rational understanding is that we can get directly at the answer without walking through countless possibilities. This is why I was much more impressed with Kasparov than with Deep Blue, for Kasparov was able to keep the match competitive even though he could not possibly go through millions of possibilities each turn. He had real wisdom and understanding, and could intuitively grasp the most likely successful move on each turn, with a high degree of accuracy.

Some, unwilling to accept a fundamental distinction between computers and authentic rational beings, have sought to demote the latter to the status of computers. They will say, in effect, that what we have said about how computers work is perfectly true, but human beings do not do anything more than this. All we do is process data, and relate inputs to outputs. This position can hardly be characterized as anything but profound willful ignorance. A moment’s careful introspection should suffice to demolish this characterization of human intelligence.

Unfortunately, philosophical naivete is endemic in computer science, which purports to reduce intensional meaning and understanding to their extensional representations. This is linguistically naive as well, for if a signal is an arbitrary sign for a concept, it follows that meaning is not to be found in the signal itself. The computer never interprets anything; it only converts one set of signals into another set. It is up to us rational beings to interpret the output as an answer with meaning.

Highly accurate natural language processing is an important step toward establishing credible computerized mimicry of intelligent processes without subjective understanding. Although we can never create genuine intelligence using the current modalities of computer engineering, we might do well enough to create a superficially convincing substitute. In a world that increasingly treats human beings with a functionalistic input-output mentality, such developments could have profound social and ethical implications, which I treat in my new short story, “The Turing Graduate,” to be published soon.

Cooling the Global Warming Rhetoric

There was no global warming from 2001 through 2009, yet we are now told that this past decade was the warmest on record. Both assertions are factually correct, though they paint very different pictures of the state of the world’s climate.

As we can see in the above graph, 1998 was an unusually hot year (due to “El Niño”) compared with the previous decade, followed by an uncharacteristically cool 1999. Temperatures rose in 2000 and 2001, short of the 1998 peak, yet higher than the average for the 1990s. From 2001 through 2009, the annual global mean temperature anomaly remained about the same, within measurement error. The media likes to publish things like “this was the third hottest year on record,” but such year-to-year rankings are meaningless, as they are distinguished only by statistical noise. Scientists know better than that, yet NASA researchers have often played a media game where they announce that a given year is on track to be the hottest on record. In mid-2005, for example, NASA announced that the global temperature was on track to surpass the 1998 record. Such a distinction is meaningless, since the temperatures were about the same, within the accuracy of the models.

Note that we do not measure the average global temperature, but a temperature anomaly, or change with respect to some baseline. In absolute terms, the average surface temperature of Earth is about 15 degrees Celsius, a figure that is accurate only to within a full degree. It is impossible to come up with an accurate global mean temperature in absolute terms, due to the lack of weather stations in regions such as oceans and deserts, which magnifies the overall uncertainty. How, then, can we come up with more accurate measurements of global temperature change?

First, a baseline period is chosen. A common choice is the period from 1951-1980, which was relatively cool compared to the decades before and after. The graph shown uses 1961-1990 as a baseline, and many current measurements use 1971-2000, since there is more data for this period. For each temperature station or location, we compare the current measured temperature with the average measurement at that station over the baseline period; the difference is called the “anomaly,” which is really just an increase or decrease with respect to our arbitrary norm (i.e., the baseline). We can measure each local temperature anomaly to the accuracy of the thermometer, and climate scientists maintain that interpolating the anomaly between stations introduces little error, since temperatures in a region tend to rise and fall by about the same amount. On this basis, the global mean temperature anomaly is claimed to be accurate to 0.05 degrees Celsius. This is more accurate than any individual measurement because independent errors add in quadrature when averaged, so the standard error of the mean decreases as one over the square root of the number of stations and is smaller than the error of any single measurement.
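As a concrete illustration of this bookkeeping, here is a minimal sketch; the station values are invented, and a real analysis (GISTEMP, for instance) also grids the data and weights stations by the area they represent:

```python
# Toy illustration of the anomaly method. Station data are invented;
# real analyses also grid and area-weight the stations.
import math

# Baseline mean (e.g., 1951-1980) and current-year mean at each station, in deg C.
stations = {
    "A": {"baseline": 10.2, "current": 10.9},
    "B": {"baseline": 24.7, "current": 25.1},
    "C": {"baseline": -3.4, "current": -2.8},
    "D": {"baseline": 15.0, "current": 15.6},
}

anomalies = [s["current"] - s["baseline"] for s in stations.values()]
mean_anomaly = sum(anomalies) / len(anomalies)

# If each station anomaly carries an independent error of ~0.5 deg C, the errors
# add in quadrature, so the error of the mean shrinks as 1/sqrt(N).
station_error = 0.5
mean_error = station_error / math.sqrt(len(anomalies))

print(f"mean anomaly = {mean_anomaly:+.2f} +/- {mean_error:.2f} deg C")
```

The claimed 0.05-degree accuracy of the global figure rests entirely on the independence and representativeness assumptions built into that last step.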

This assessment of the accuracy depends on the assumption that our weather stations are a fair representative sample of global climate, and that our interpolations over local regions are valid and accurate. Logically, we cannot know the actual global temperature increase or decrease any more accurately than we know the global mean temperature: i.e., within 1 degree Celsius. What we call the “global mean temperature anomaly” is really a weighted average of all measurement stations, which we assume to be a representative sample of the entire globe. Strictly speaking, this is not the same as “how much the temperature increased globally”.

In fact, it is arguable that the notion of “global temperature” is a will-o’-the-wisp. The atmosphere is in constant thermal disequilibrium, both locally and globally, making it practically, and even theoretically, impossible to have a well-defined temperature at any point in time, since the notion of temperature presumes a large-scale equilibrium. Even if we could integrate the momentary temperature over time at a given point, the “average” value we get would not be at all representative of the actual temperature, since more time is spent near the extremes than near the average. This is why we take the maximum and minimum temperatures of each 24-hour period (the daily “high” and “low”) as our data set for each measurement station. There is also the difficulty that surface temperature varies considerably within the first fifty feet above the ground, and there is no fixed definition of the height at which “surface temperature” should be measured. This introduces the real possibility of systematic error, and makes our assessment of the accuracy of the mean anomaly somewhat suspect.
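The point about extremes can be made precise. If the diurnal cycle were an ideal sinusoid, $T(t) = T_0 + A\sin(\omega t)$, the fraction of time spent near a given temperature follows the arcsine density,

$$
p(T) = \frac{1}{\pi\sqrt{A^{2} - (T - T_{0})^{2}}}, \qquad T_{0} - A < T < T_{0} + A,
$$

which is smallest at the mean $T_0$ and diverges at the daily extremes $T_0 \pm A$. The real cycle is not a pure sinusoid, but the qualitative conclusion stands: the daily high and low are better sampled than any notional “average.”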

Let us assume, for the sake of argument, that the mean temperature anomaly really is accurate to 0.05 degrees Celsius. Then if 2005 was 0.75 degrees Celsius above the 1951-1980 average, while 1998 was 0.71 degrees Celsius above average, it is statistically meaningless to announce that 2005 was the hottest year on record, since we do not know the mean temperature anomaly accurately enough to know whether 1998 or 2005 was hotter.
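To make the arithmetic explicit, taking the stated 0.05-degree uncertainty at face value and assuming independent errors for the two years:

$$
\Delta T = 0.75 - 0.71 = 0.04\ ^{\circ}\mathrm{C}, \qquad
\sigma_{\Delta} = \sqrt{0.05^{2} + 0.05^{2}} \approx 0.07\ ^{\circ}\mathrm{C}.
$$

The difference between the two years is smaller than its own combined uncertainty, so the ranking tells us nothing.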

All of the foregoing deals with straight temperature records, not adjusting for anything except regional interpolation. However, there are many tricks – not falsifications, but misleading portrayals – one can perform in order to paint the picture one wishes to show. For example, one can subtract out natural oscillations caused by things like El Niño, volcanic eruptions, and variations in solar irradiance, to show what warming would have occurred without these natural effects. This methodology can be dubious, for one is subtracting natural cooling effects without subtracting potential warming effects that may be coupled to them.
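The subtraction described above is typically done by regressing the temperature series on indices for the natural factors and then examining what is left over. A bare-bones sketch of such a procedure (every series below is an invented placeholder, not real climate data):

```python
# Bare-bones sketch of "removing" natural factors by linear regression.
# All series below are invented placeholders, not real climate data.
import numpy as np

years = np.arange(1980, 2010)
n = len(years)
rng = np.random.default_rng(0)

temp = 0.02 * (years - 1980) + 0.1 * rng.standard_normal(n)  # fake anomaly series
enso = rng.standard_normal(n)   # fake ENSO index
tsi  = rng.standard_normal(n)   # fake solar-irradiance variation
aod  = rng.standard_normal(n)   # fake volcanic aerosol index

# Least-squares fit: temp ~ const + ENSO + TSI + volcanic
X = np.column_stack([np.ones(n), enso, tsi, aod])
coef, *_ = np.linalg.lstsq(X, temp, rcond=None)

# Temperature with the fitted natural signals subtracted out.
adjusted = temp - X[:, 1:] @ coef[1:]
```

The catch, as noted above, is that this treats the natural factors as independent, additive terms, which is precisely what a chaotic, coupled climate system does not guarantee.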

There was dramatic global cooling in 2007, driven by strong La Niña conditions (the decrease in solar activity that year was not enough to account for the cooling). The cooling trend continued in 2008. Although the stagnation in annual mean temperature throughout the decade is still consistent with the hypothesis of long-term anthropogenic global warming, some climate scientists recognized that this presented bad PR for their views, and reacted accordingly.

In October 2009, Paul Hudson of the BBC (an organization generally sympathetic to the AGW hypothesis) wrote that there had been no warming since 1998. This was a fair and accurate presentation of reality, as climate scientists privately admitted. According to the infamous East Anglia e-mails, even the strident advocates of the AGW hypothesis acknowledged the lack of warming. In response to the Hudson article, Kevin Trenberth wrote: “The fact is that we can’t account for the lack of warming at the moment and it is a travesty that we can’t.” Tom Wigley disagreed that the lack of warming was unexplainable, yet even he admitted there was a lack of warming:

At the risk of overload, here are some notes of mine on the recent lack of warming. I look at this in two ways. The first is to look at the difference between the observed and expected anthropogenic trend relative to the pdf [probability density function] for unforced variability. The second is to remove ENSO [El Niño Southern Oscillation], volcanoes and TSI [Total solar irradiance] variation from the observed data.

Both methods show that what we are seeing is not unusual. The second method leaves a significant warming over the past decade.

Wigley is basically saying that if certain natural variable factors were removed, there would have been warming. However, these variations did in fact occur, so there was no actual warming. Wigley shows how the lack of warming might be explained by other factors, yet he does not deny that, in actuality, there was no warming.

It is very dubious to play this game of “what would have happened” in climate history, since all atmospheric events are interconnected in a chaotic flux, making it impossible to cleanly subtract out an effect as if it were an autonomous unit. This is why analyses trying to show what percentage of warming is caused by which effect are really unhelpful. I have little confidence in the ability of climate scientists to make long-term predictions about chaotic systems with insanely complex oceanic and biological feedbacks. From my recollections at MIT, those who majored in earth and atmospheric sciences weren’t the most mathematically gifted, so I doubt that they’ve since managed to work such a mathematical miracle. It is hardly encouraging that climate scientist Gavin Schmidt claimed that the Stefan-Boltzmann constant should be doubled since the atmosphere radiates “up and down”! I don’t fault James Hansen for erroneously predicting to Congress in 1989 that there would be a decade of increased droughts in North America. What is objectionable is when such a claim is presented as the only scientifically plausible inference. I know that CO2 is a greenhouse gas, and that it should cause warming, all other things being equal, but all other things aren’t equal.

What do we know, then, if anything? The graph below shows a five-year moving average of the global mean temperature anomaly. (The moving average smooths out irregular year-to-year spikes.)
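For reference, such a smoothed curve is nothing more than a centered five-year mean at each point; a sketch of the mechanics, using made-up values:

```python
# Centered five-year moving average of an annual anomaly series.
# The values below are made up purely to show the mechanics.
def moving_average(values, window=5):
    half = window // 2
    return [sum(values[i - half:i + half + 1]) / window
            for i in range(half, len(values) - half)]

anomalies = [0.40, 0.55, 0.35, 0.42, 0.48, 0.51, 0.47, 0.50, 0.46, 0.49]
print(moving_average(anomalies))  # one smoothed value per year, minus the two ends
```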

From this graph, several distinct periods can be discerned with regard to temperature trends.

1860-1900: constant to within +/- 0.1 deg C.

1900-1910: dropped to 0.2 deg C below the 1860-1900 average.

1910-1940: warmed 0.5 deg C from 1910 low, or 0.3 deg C above 1860-1900 avg.

1940-1950: dropped 0.2 deg C from 1940 peak, or 0.1 deg C above 1860-1900 avg.

1950-1975: stable to within 0.1 deg C (that’s why it’s used as a baseline); in 1975, 0.2 deg above 1860-1900 avg.

1975-2000: 0.5 deg warming, or 0.7 deg above 1860-1900 avg.

2001-2009: constant

So, there were thirty years of warming (1910-1940) totaling 0.5 deg, then a cooling/stable period (1940-1975), then another twenty-five years of warming (1975-2000), again 0.5 deg. This is quite different from the popular conception of monotonic, geometrically increasing temperature (i.e., “the hockey stick,” which makes clever choices of start date, zero point, and axis scale). There have actually been two periods to date of dramatic global warming: 1910-1940 and 1975-2000. This rudimentary analysis already suggests that anthropogenic effects on climate are more subtle and complex than a simple greenhouse effect.

Looking at this graph, we are right to be skeptical about Arctic ice cap alarmism. We have only been measuring the Arctic ice cap since 1979, which coincides with the start of the last great warming period. We do not, then, have an adequate long-term view of the cap, and cannot reliably extrapolate this extraordinary warming period into the future. Predictions of a one to three meter sea level rise, displacing hundreds of millions of people, such as that infamously made by Susmita Dasgupta in 2009, ought to be put on hold for the moment. An imminent climate disaster is always coming but never arrives. Just read the prophecies of imminent doom, flood, and drought from the 1980s, or go back decades earlier to read the predictions of a new ice age.

In order to make the current warming seem even more catastrophic, many climate scientists have seen fit to rewrite the past, going so far as to deny the Medieval Warm Period, which is abundantly well established by the historical record. In their perverse epistemology, contrived abstract models with tenuous suppositions are granted greater credibility than recorded contemporary observations. There is no getting around the fact that Greenland and Britain then had much more habitable and arable land than at present. A more plausible hypothesis is that the Medieval Warm Period was not global, yet this too is increasingly refuted by paleoclimatic evidence.

I find that the importance of observed global warming is diminished, not enhanced, by excessive alarmism and overreaching. Short of an Arctic collapse – improbable, since the ice cap is much more advanced now than in the medieval period – we are left with the projection that global temperature may increase as much as a full degree C by 2100, with a probable sea level increase of about a foot: bad, but not cataclysmic. Of course, that doesn’t get you in the top journals or on government panels.

I’ve noticed that in physics, where things really are established to high degrees of certainty and precision, no one ever says “the science is overwhelming”. It is only when the science is not so “hard,” as in the life, earth, and social sciences, that you need to make repeated appeals to authority. When the science truly is overwhelming, the evidence speaks for itself, and no polls need to be taken.

Update – 14 Oct 2010

One of the last of the WWII-era physicists, Harold “Hal” Lewis of UC-Santa Barbara, has stirred an animated discussion with his letter of resignation from the American Physical Society, contending that the APS has been complicit in the “pseudoscientific fraud” of global warming and that the Society has suppressed attempts at open discussion. He invokes the ClimateGate e-mails as evidence of fraud, and attributes the lack of critical discourse to the abundance of funding opportunities for research supporting the global warming hypothesis.

The Chimera of Life on Mars

Last November, NASA scientists rehashed an old claim – no, not that the polar cap will disappear in four years – but that there is evidence of life on Mars in a meteorite discovered in 1984. Further clarification has been offered over the weekend. Behind all the hullabaloo over how these “biomorphs” were found, one inconvenient fact is ignored: they are too small to be living things.

Back in the nineties, “nanobacteria” were theoretical self-replicating calcifying particles, proposed to account for certain biological calcification processes. At the time the first “life on Mars” announcement was made in 1996, it was still believed that nanobacteria could be true lifeforms. The shape and chemical composition of formations in the ALH84001 meteorite were similar to those of putatively identified nanobacteria on Earth. However, a geological process was suggested to explain the formations, so the nanobacterial explanation of the meteorite subsided in public discourse.

It should be emphasized that “nanobacteria”, contrary to their name, are not bacteria, or even lifeforms at all. At 20-100 nm across, they are too small to contain nucleic acids (DNA or RNA) or conduct even the most basic bacterial biochemistry. The smallest known bacterium, by contrast, is well over 300 nm across, and it is physically impossible for a true lifeform (with nucleic acids, ribosomes, and proteins) to be smaller than about 200 nm. Known nanobacteria have no organelles of any sort; they are just globules of carbonate compounds. To call them lifeforms is a serious abuse of language.

The death knell for nanobacteria was sounded in a series of papers in 2007, 2008, and 2009 proving that putative “nanobacteria” were just carbonate nanoparticles. They may replicate, as do many crystals and polymers, but they have no nutritive or genetic functions, so they can hardly be considered life unless we lower the bar considerably.

Ironically, although the star of nanobacteria had by then faded, in 2009 NASA published evidence that the observed iron sulfide and iron oxide grains in the 1984 meteorite were not the result of a geological process, but were likely caused by nanobacterial activity. The NASA scientists, followed by the press, proclaimed this as evidence of life on Mars, even though by now it was understood that nanobacteria are not lifeforms in any standard sense of the term. Some reports even referred to nanobacteria as tiny bacteria, as if the only difference were one of size, when in fact the inaptly named “nanobacteria” are not bacteria at all. The lay reader could be excused for not knowing that this “life from Mars” is too small and structureless to have DNA or RNA or any internal metabolism.

What the new evidence really shows, at most, is that the grains in the Martian meteorite were produced by nanobacteria rather than by some other geological process, so the globule formations in the meteorite are likely to be fossil nanobacteria. This identification, however, has no bearing on the microbiological question of whether nanobacteria themselves are living things, a question that is increasingly being answered in the negative as the crystallization processes that form these nanoparticles become better understood.

It is certainly possible that so-called nanobacteria are a precursor to life, just as water, as far as we know, is a necessary condition for life. Yet a necessary condition is not a sufficient condition, so the presence of carbonate nanoparticles is a far cry from proof that Mars ever developed life at some point, just as the presence of water is no proof that Mars was ever habitable. I may need wood to build a house, but the presence of wood in a forest is no proof that there were ever houses there. Such is the vast gap between nanoparticles and the simplest bacterium. Abiogenesis, it should be noted, remains pure speculation, without any empirical verification or well-defined theoretical mechanism.

The real reason to tart up the evidence for life on Mars is to justify the boondoggle of our manned space program, which is still struggling to develop a vehicle to return to the Moon. There is practically no real science that can be done with manned missions that cannot be done with robots, except studies on the effects of microgravity on humans. The idea that there might be other life out there is what gives the whole concept of manned space exploration some credible purpose. Otherwise, we’re just exploring rocks and gas giants that could just as easily be observed by probes.

While one should never say never, the Fermi paradox is looking as strong as ever, and we may at some point be forced to admit that life is so rare in the universe that we are unlikely to encounter it in any form for thousands of years, if ever. Our efforts might be better spent, then, in tending to our own world, while others perhaps tend to theirs.

Update – 29 Apr 2010

In a case of “be careful what you wish for,” the Obama administration has effectively killed the U.S. manned space program, canceling the Constellation program without any alternative, while the shuttles will be grounded after this year. Responding to the backlash, the politically adroit president backpedaled, making a vague promise (not a concrete mission) of an eventual trip to Mars, and partially reactivating the Orion capsule program as an unmanned payload and escape capsule. These are purely face-saving measures. The President does not even have to decide on a heavy-lift design until 2015, which means there will likely be at least a full decade during which the U.S. does not have manned space flight capability, assuming the program isn’t postponed or killed along the way. In the meantime, we can expect an exodus of aerospace talent to Europe, Russia, and China.

Taking a page from George W. Bush, Obama has sought cover from the private sector, promising that companies will be able to take astronauts into low earth orbit for ISS missions. In fact, they are nowhere near such capability. There is a vast technological gap between launching manned and unmanned spacecraft, and between orbital and suborbital flights. Only one company, Scaled Composites, has successfully achieved suborbital manned space flight, and its “spacecraft” are actually rocket-powered aircraft that exceed Mach 3. It would require more than 60 times the energy to reach orbital speed, not to mention the formidable engineering difficulties of space navigation, docking, life support, and re-entry. For every SpaceX or Scaled Composites, there are a dozen failed space ventures, yet this is supposed to be our most cost-effective means of returning to LEO.
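The factor of sixty follows from a rough kinetic-energy comparison (and it ignores the additional potential energy needed to climb to orbital altitude). Taking Mach 3 as roughly 1 km/s and low Earth orbit as about 7.8 km/s:

$$
\frac{E_{\text{orbit}}}{E_{\text{Mach }3}} \approx \left(\frac{v_{\text{orbit}}}{v_{\text{Mach }3}}\right)^{2} \approx \left(\frac{7.8\ \text{km/s}}{1.0\ \text{km/s}}\right)^{2} \approx 61.
$$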

Make no mistake: Obama is not a fool. He simply is not terribly interested in the manned space program, as his actions prove. Nor can he expect sound advice from his space policy advisor Lori Garver, a career policy wonk with zero engineering experience and a fetish for “climate change” pseudoscience. It is true that much good science can be done more cost-effectively with unmanned probes, as I myself have argued above. But let us have no illusions that the scuttling of the manned space program is actually an intelligent plan to improve it. More than a few scientists, who would have derided Bush as an anti-intellectual giving handouts to companies had he done this, are falling over themselves to buy into Obama’s promises of next-generation lift technology for Mars, even as we surrender our current-generation capability of reaching the Moon. Yet I have been exposed to the academic community long enough to know that liberal sympathies trump logic, notwithstanding pretensions of scientific objectivity.