In this election season, the constant citation of polls is a reminder of how little even educated people understand about statistics. I should like to review a few basic errors that cause most people to overvalue the accuracy of polls and other studies based on statistical sampling and correlation.
Journalistic polls often state a “sampling error” of 3 or 4 percentage points. This sampling error measures only the statistical uncertainty that comes from taking a sample of several hundred or several thousand random people out of the entire population represented. It does not include other sources of error, such as systematic sampling bias from favoring, say, urban over rural respondents, women over men, and so on. Thus the total error of a poll is usually greater than the stated sampling error. This is why voter exit polls turn out to be inaccurate more often than their sampling error would indicate. If the error were truly 3 percent (one standard deviation), we would expect the poll to be accurate within that margin about two-thirds of the time, following a normal distribution.
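To make the arithmetic concrete, here is a small sketch of where such figures come from, assuming a simple random sample and a proportion near 50 percent; it computes only the sampling error and says nothing about the systematic biases just mentioned.

```python
import math

def sampling_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion p estimated from a simple random
    sample of size n; z = 1.96 gives the conventional 95% margin,
    z = 1.0 gives one standard error (~68%)."""
    return z * math.sqrt(p * (1 - p) / n)

# A poll of about 1,000 respondents yields the familiar 3-point margin:
print(round(100 * sampling_error(1000), 1))          # ~3.1 points at 95%
print(round(100 * sampling_error(1000, z=1.0), 1))   # ~1.6 points at one sigma
```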
Understatement of error is also common in economics. Recently, former Treasury Secretary Robert Rubin opined that, by the lights of conventional economic models, the current financial crisis was a “low probability” event. However, as Benoit Mandelbrot has pointed out, conventional models of asset valuation substantially underestimate risk, since they assume a Gaussian (normal) distribution of price variations when a heavy-tailed distribution such as the Cauchy would be more accurate. Higher mathematics aside, we could gather as much from the fact that “low probability” events occur with remarkable regularity and frequency. Rubin’s understatement of the error in his economic model leads to a tragic failure to appreciate that there may be systemic reasons for our propensity for bubbles and busts; instead, he regards the crisis as a freak occurrence.
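To see how much risk the Gaussian assumption hides, compare the probability of a large swing under the two distributions; the figures below are my own illustration, not Mandelbrot’s, and the “units” are simply each distribution’s scale parameter.

```python
from scipy.stats import norm, cauchy

# Two-sided probability of a move more than k scale units from the center,
# under a Gaussian model versus a heavy-tailed Cauchy model.
for k in (3, 5, 10):
    p_gauss  = 2 * norm.sf(k)
    p_cauchy = 2 * cauchy.sf(k)
    print(f"{k:>2} units: Gaussian {p_gauss:.1e}   Cauchy {p_cauchy:.1e}")

# The Gaussian model calls a 10-unit swing essentially impossible (~1e-23),
# while the Cauchy model expects it several percent of the time (~6e-2).
```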
Worse still is when polls are advanced to support claims for which they may have little relevance. Telling us that a majority of economists support Candidate X is not an economic argument for Candidate X, any more than a majority of physicists supporting Candidate X would prove the candidate is good for physics. If anything, it tells us about the political affiliations of economists or physicists, which is sociological data, not a scientific argument. Hard science does not work by taking polls of scientists, but demands that reasons be produced for a position.
Medical studies are often interpreted by journalists as proving causality when they show only statistical correlation. A good rule of thumb is never to assume causality unless a clear aetiology can be shown. Here, common sense may serve as an adequate substitute for mathematical expertise. When the medical consensus changes repeatedly within a matter of decades, we can be sure that the facts were never as firmly established as originally claimed. Medical studies understate their errors by failing to take measurement error and systematic error into account in their statistical analysis. Further, they usually show correlations or “risk factors” without demonstrating causality. For these reasons, the certitude of medical wisdom should be viewed skeptically. Lastly, the claim that “there is no evidence that X is dangerous” can simply mean that no adequate study of the matter has been done.
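A toy simulation, with wholly invented variables, shows how a “risk factor” can emerge from a hidden common cause and vanish once that cause is accounted for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# A hidden factor Z (say, age) raises both an exposure X and an outcome Y;
# X has no causal effect on Y whatsoever.
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 0.8 * z + rng.normal(size=n)

def residuals(v, z):
    """Remove the linear effect of z from v."""
    slope, intercept = np.polyfit(z, v, 1)
    return v - (slope * z + intercept)

print(f"raw correlation of X and Y: {np.corrcoef(x, y)[0, 1]:.2f}")   # ~0.39
print(f"after adjusting for Z:      "
      f"{np.corrcoef(residuals(x, z), residuals(y, z))[0, 1]:.2f}")   # ~0.00
```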
In all these cases, a healthy skepticism combined with common sense can guard against most statistical fallacies, even when mathematical sophistication is lacking. Mathematics, after all, is wholly derived from intuitive, rational principles, so it cannot of itself yield absurd results. When a presentation of statistical results seems completely contrary to reality, it is usually a safe inference that a wrong assumption underlies the analysis. Even sophisticated statisticians can err, though they calculate impeccably, if they misconstrue the assumptions or conditions of the question they believe they are answering. When studies claiming 90 or 95 percent accuracy prove inaccurate more than 10 percent of the time, it does not take a mathematician to realize that there is a lot of overclaiming in the soft sciences.
Update: 29 December 2008
To give a current example of misleading statistics, a new study claims that teens making abstinence pledges are no less likely to have premarital sex than those who do not. If that sounds counterintuitive, it is because it is not true. Pledgers are indeed less likely to fornicate, but the study chose to control for factors such as conservatism, religion, and attitudes about sex, comparing only pledgers and non-pledgers with similar characteristics. Unsurprisingly, this yielded no difference, since the pledge itself does not magically cause abstinence; rather, the underlying attitudes and values are the cause. This is a far cry from showing that abstinence programs are ineffective. It would be like saying education is ineffective on the ground that it is really the knowledge imparted, not the schooling, that changes behavior. Once again, competent scientists blinded by their biases can make inapt choices of groups to compare, and draw interpretations that do not follow.
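To illustrate with invented numbers (nothing below is taken from the study itself), suppose a single underlying attitude drives both the pledge and the behavior: the raw comparison shows a large difference, while restricting the comparison to teens of similar attitudes, as the study did, makes it disappear.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical: "attitude" (religiosity, conservatism, views on sex) drives
# BOTH pledging and behavior; the pledge itself is only a marker of attitude.
attitude = rng.random(n)                          # 0 = permissive, 1 = strict
pledged  = rng.random(n) < attitude               # stricter teens pledge more
had_sex  = rng.random(n) < 0.8 * (1 - attitude)   # stricter teens abstain more

print(f"raw rates:     pledgers {had_sex[pledged].mean():.2f}, "
      f"non-pledgers {had_sex[~pledged].mean():.2f}")            # ~0.27 vs ~0.53

# Compare only teens with similar attitudes and the gap disappears.
similar = (attitude > 0.6) & (attitude < 0.7)
print(f"matched group: pledgers {had_sex[pledged & similar].mean():.2f}, "
      f"non-pledgers {had_sex[~pledged & similar].mean():.2f}")  # ~0.28 vs ~0.28
```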