Studies Show Most Do Not Understand Statistics

In this election season, repeated citations of polls provide reminders of how little even most educated people understand about statistics. I should like to review a few basic errors that cause most people to overvalue the accuracy of polls and other studies based on statistical samplings and correlations.

Journalistic polls often state a “sampling error” of 3 or 4 percentage points. This figure measures only the statistical error that results from drawing a random sample of several hundred or several thousand people from the entire population represented. It does not include other sources of error, such as systematic sampling bias resulting from favoring, say, urban over rural respondents or women over men. Thus the total error of a poll is usually greater than the stated sampling error. This is why voter exit polls turn out to be inaccurate more frequently than their sampling error would indicate. If the total error were truly 3 percentage points, we would expect the poll to be accurate within that margin about 95 percent of the time, since the stated margin is conventionally a 95 percent confidence interval, or roughly two standard errors of a normal distribution; a margin of only one standard error would hold about two-thirds of the time.
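To make the sampling error concrete, here is a minimal sketch in Python of the usual normal-approximation formula for a simple random sample. The function name and the sample sizes are illustrative assumptions, not taken from any particular poll; z = 1.96 corresponds to the conventional 95 percent level, while z = 1 gives the one-standard-error band that holds roughly two-thirds of the time.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation interval for a sampled proportion.

    n: number of respondents in a simple random sample (assumed)
    p: true proportion; 0.5 is the worst case and the usual default
    z: number of standard errors (1.96 for 95 percent, 1.0 for about 68 percent)
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# Illustrative sample sizes only.
for n in (400, 1000, 2500):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

Note that the formula covers random sampling error alone. Bias in who is reached or who responds does not shrink as the sample grows, so it appears nowhere in this figure.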

Understatement of the error is also common in economics. Recently, former Treasury Secretary Robert Rubin opined that the current financial crisis was a “low probability” event, following conventional economic models. However, as Benoit Mandelbrot has pointed out, conventional economic models of asset valuations substantially underestimate risk, since they assume a normal (Gaussian) distribution of variations in price when a heavy-tailed distribution, such as the Cauchy, would be more accurate. Higher mathematics aside, we could gather as much when we consider that “low probability” events occur with remarkable regularity and frequency. Rubin’s understatement of error in his economic model leads to a tragic failure to appreciate that there may be systemic reasons for our propensity for bubbles and busts; instead, he regards the crisis as a freak occurrence.
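The difference between the two assumptions can be sketched numerically. The Python snippet below compares the probability of a move larger than k units under a standard normal and under a standard Cauchy distribution. It is an illustration only: the Cauchy has no finite standard deviation, so k counts scale units in both cases, and the comparison is not a calibrated model of any market.

```python
import math

def normal_tail(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2.0))

def cauchy_tail(k):
    """P(|X| > k) for a standard Cauchy variable X (location 0, scale 1)."""
    return 1.0 - (2.0 / math.pi) * math.atan(k)

for k in (2, 5, 10):
    print(f"k={k}: normal {normal_tail(k):.2e}, Cauchy {cauchy_tail(k):.2e}")
```

Under the normal assumption a move of 5 units is rarer than one in a million; under the Cauchy it happens more than one time in ten. A model built on the first assumption will label as freak occurrences events that the second treats as routine.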

Worse still is when polls are advanced to support claims for which they may have little relevance. Telling us that a majority of economists support Candidate X is not an economic argument for Candidate X, any more than a majority of physicists supporting Candidate X would prove the candidate is good for physics. If anything, it tells us about the political affiliations of economists or physicists, which is sociological data, not a scientific argument. Hard science does not work by taking polls of scientists, but demands that reasons be produced for a position.

Medical studies are often interpreted by journalists to prove causality when they only show statistical correlation. A good rule of thumb is never to assume causality unless a clear aetiology can be shown. Here, common sense may serve as an adequate substitute for mathematical expertise. When the consensus on medical wisdom constantly changes in a matter of decades, we can be sure that the facts were never as firmly established as originally claimed. Medical studies understate their errors by failing to take into account measurement error and systematic error in their statistical analysis. Further, they usually show correlations or “risk factors” without demonstrating causality. For these reasons, the certitude of medical wisdom should be viewed skeptically. Lastly, the claim “there is no evidence that X is dangerous” can simply mean no adequate study of the matter has been done.

In all these cases, a healthy skepticism combined with common sense can guard against most statistical fallacies, even when mathematical sophistication is lacking. Mathematics, after all, is wholly derived from intuitive, rational principles, so it cannot yield absurd results. When a presentation of statistical results seems completely contrary to reality, it is usually a safe inference that there is a wrong assumption underlying the analysis. Even sophisticated statisticians can err, though they calculate impeccably, if they misconstrue the assumptions or conditions of the question they believe they are answering. When studies claiming 90 or 95 percent accuracy prove to be inaccurate more than 10 percent of the time, it doesn’t take a mathematician to realize that there is a lot of overclaiming in the soft sciences.

Update: 29 December 2008

To give a current example of misleading statistics, a new study claims that teens making abstinence pledges are no less likely to have premarital sex than those who do not. If that sounds counterintuitive, it is because it is not true. Pledgers indeed are less likely to fornicate, but the current study decided to control for factors such as conservatism, religion, and attitudes about sex, and compared pledgers and non-pledgers with similar characteristics. Unsurprisingly, this yielded no difference, since the pledge itself does not magically cause abstinence; rather, the underlying attitudes and values are the cause. This is a far cry from showing abstinence programs are ineffective. It would be like saying that education is ineffective because it is knowledge, not schooling, that changes behavior. Once again, competent scientists blinded by their biases can make inapt choices of groups to compare, and make interpretations that do not follow.
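The effect of controlling for attitudes can be illustrated with a toy simulation in Python. Every number below is a made-up assumption chosen for illustration, not data from the study: attitude alone determines the chance of abstaining, and attitude also determines the chance of taking a pledge.

```python
import random

ATTITUDES = ("conservative", "permissive")

def simulate(n=100_000, seed=0):
    """Return counts[(attitude, pledged)] = [teens, teens abstaining]."""
    rng = random.Random(seed)
    counts = {(a, p): [0, 0] for a in ATTITUDES for p in (True, False)}
    for _ in range(n):
        attitude = "conservative" if rng.random() < 0.5 else "permissive"
        # Hypothetical rates: attitude drives both pledging and abstaining.
        pledged = rng.random() < (0.6 if attitude == "conservative" else 0.1)
        abstains = rng.random() < (0.5 if attitude == "conservative" else 0.2)
        cell = counts[(attitude, pledged)]
        cell[0] += 1
        cell[1] += abstains
    return counts

def rate(counts, pledged, attitude=None):
    """Abstinence rate among pledgers or non-pledgers, optionally within one attitude."""
    cells = [v for (a, p), v in counts.items()
             if p == pledged and attitude in (None, a)]
    return sum(c[1] for c in cells) / sum(c[0] for c in cells)

counts = simulate()
print("raw gap:", round(rate(counts, True) - rate(counts, False), 3))
for a in ATTITUDES:
    gap = rate(counts, True, a) - rate(counts, False, a)
    print(f"gap among {a} teens:", round(gap, 3))
```

In this sketch pledgers abstain noticeably more often than non-pledgers overall, yet the gap all but vanishes once teens with the same attitude are compared. Both statements are true of the same simulated population, so the interpretation hinges entirely on which comparison is reported.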

Another Blow for Cholesterol Drugs

Merck & Co. has taken a large hit with the rejection of its latest cholesterol drug, as the U.S. Food & Drug Administration (FDA) finally takes a stand against such useless medications. Although the FDA, like the medical establishment, has by no means abandoned the untenable hypothesis that cholesterol causes heart disease, it at least recognizes that a drug ought to be judged on its clinical outcomes rather than its ability to change numbers on lab tests.

It has long been known that many cholesterol drugs are of limited or negligible effectiveness. Statins are not effective at primary prevention and have a host of serious side effects. Fibrates are clinically worthless, having no impact on clinical outcomes. Yet both classes of drug do successfully lower LDL, “bad cholesterol,” and fibrates also raise HDL, “good cholesterol.” This fact alone might cause rational people to see problems with the cholesterol hypothesis as a cause of heart disease. Instead, it is more likely that high LDL is an indicator of other factors, such as lack of exercise, that are also linked to heart disease risk. Unfortunately, correlation and causation are often confused in medicine, and the misleading term “risk factor,” which refers only to statistical correlation, does much to confuse lay persons.

The lack of substantial positive clinical outcomes from cholesterol drugs ought to be weighed against the negative side effects, including weakening of the nervous system and Co-Q10 depletion by statins, and increased risk of liver disease from fibrates. The supposed benefit of these drugs is to alter cholesterol levels, which may or may not reflect an underlying unhealthy condition. High LDL is positively correlated with obesity, smoking, lack of exercise, excessive alcohol consumption and other unhealthy conditions. Yet it would be a mistake to consider LDL “bad cholesterol,” as it is essential to the formation of hormones. Indeed, cholesterol that is too low can also be dangerous, yet there is no recommended minimum LDL level, only a maximum.

Far more devastating to our health than the false equation between serum cholesterol and heart disease is the fictitious link between blood serum cholesterol and diet, particularly dietary cholesterol and saturated fats. There has been no proven link between dietary cholesterol and blood cholesterol; in fact, most blood cholesterol is produced by the body. Further, hundreds of tribes throughout the world, known for the near total absence of heart disease among them, eat food rich in fat, especially game meat. This hunter-gatherer diet should be proof positive against any simplistic correlation between dietary fat and heart disease. In many Western countries, meat is unhealthy for other reasons: cattle, for instance, are not grass-fed, and thus have all the unhealthy traits of a high-carbohydrate diet.

This leads to the core problem of heart health in modern societies: the consumption of refined sugars and other carbohydrates. These, combined with hydrogenated oils and other processed oils, are far greater health dangers than the unfairly maligned red meats and saturated fats. Indeed, much of the nutritional advice proffered in the U.S. over the last fifty years has been the exact opposite of what would promote heart health, which is one reason age-adjusted incidence of heart disease has increased.

The phenomenon of cholesterol drugs is but one aspect of the greater problem of Western medicine, which regards the human body as a mechanical composition of chemicals, and tries to crudely tinker with certain quantities, often valuing these quantities more than clinical outcomes. This modality has the effect of emphasizing the use of drugs and surgical remedies rather than effective preventive measures. Ironically, such preventive measures would include rejecting some of the technology-laden aspects of food production that have ignored potential negative effects on the human organism.

The Regulation of Trans Fats

New York City has decided to ban trans fats from being served in restaurants, prompting the usual libertarian argument that this limits consumer choice, as if any consumer would choose trans fats if given a real choice. Trans fats are a serious health liability and add absolutely nothing to flavor. They simply extend shelf-life, so they are a benefit to the producer and the retailer, not the consumer. The consumer’s health is the collateral damage resulting from the manufacturer’s desire to maximize profit. At best, the consumer may benefit indirectly from a slight reduction in the price of goods, but this variation in retail value has been found to be negligible.

Libertarians would have us recoil in horror from the “nanny state” preventing restaurants from serving trans fats, as if this were an affront to liberty, but instead would allow businesses to poison their customers (who never know the trans fat content of the food served) as if this were a sign of freedom. When consumers have no knowledge of or control over the content of their food, it is difficult to see how they are acting freely. Given the opportunity, many businesses will poison their customers to the maximum extent permitted by law, which is why the FDA came into existence in the first place. Far from being advocates of freedom, the libertarians would make us slaves to the whims of unscrupulous businesses that would hydrogenate harmless fats into killer fats in order to maximize shelf-life. This is but a minor example of the greater fallacy of libertarianism: that government regulation is evil, but the same level of coercion from business is good. While the tyranny of the state is to be feared, it is no greater freedom to be at the mercy of private enterprise.