How to Interrogate a Statistic

from Chapter 12 of The Reasonable Woman (Prometheus Books, 1998)

by Wendy McElroy

In his book "How to Lie with Statistics," Darrell Huff suggests testing the statistics you encounter by asking five simple questions: Who says so? How does he know? What's missing? Did somebody change the subject? Does it make sense?

Who Says So?

The purpose of this question is to seek out possible bias on the part of the researcher or agency offering a statistic. The bias may be conscious. For example, a tobacco manufacturer may fund research and issue reports that support claims favorable to the product it markets. Although the manufacturer's clear bias should make you suspicious, the mere fact that it has a vested interest does not invalidate its statistics. But the bias does mean you should carefully examine the evidence and hold it up to the light.

Unconscious bias may also exist. Even an honest researcher carries all her assumptions with her into the research, and this will greatly influence key factors such as the phrasing of the questions that are asked. For example, a Freudian psychiatrist who accepts the Oedipal complex as being pivotal in psychological development will ask a patient different questions than a Skinnerian who believes behavior modification explains a child's evolution. The data each receives will reflect the bias with which it was gathered.

Look for the bias.

How Does He Know?

The purpose of this question is to ferret out sloppiness or bias in the research process as opposed to bias in the researcher herself. For example, imagine a researcher who rings doorbells at random in order to ask the occupant, "Have you committed a crime lately?" or even the much milder question, "Do you fart often?" The researcher is likely to discover that her sample is both crime- and fart-free, not necessarily because this finding is accurate but because few people will look a stranger in the eye and admit either to committing a crime or to performing a socially stigmatized act.

Perhaps the most common methodological mistake is to rely upon an unrepresentative sample. Consider the notorious 1936 survey conducted by the then-popular "Literary Digest." In order to predict which candidate would win the 1936 presidential election, the magazine polled millions of people, drawing names from telephone directories and its own subscription lists. The results looked decisive: the Republican candidate Alf Landon would comfortably defeat the Democratic candidate Roosevelt. Of course, Roosevelt won. The survey had been biased by the fact that, in 1936, people who could afford telephones and subscriptions to "Literary Digest" were the economically advantaged, who also tended to be Republican.
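The mechanics of an unrepresentative sample can be sketched with a little arithmetic. The example below is only an illustration: the two groups, their sizes, their preferences, and their chances of being polled are invented numbers, not 1936 data, and the function name is hypothetical.

```python
def true_and_polled_support(groups):
    """Compare a candidate's real support with what a skewed poll would show.

    groups: list of (population_share, support_for_candidate, chance_of_being_polled)
    """
    # Real support: every group counts in proportion to its size.
    true_support = sum(share * support for share, support, _ in groups)

    # Polled support: each group counts in proportion to size * reachability.
    polled_weight = sum(share * reach for share, _, reach in groups)
    polled_support = sum(share * reach * support
                         for share, support, reach in groups) / polled_weight
    return true_support, polled_support


# Made-up illustrative figures (assumptions, not historical data).
groups = [
    (0.30, 0.65, 0.60),  # better-off voters: easy to reach by phone book or subscription list
    (0.70, 0.30, 0.10),  # less well-off voters: rarely reached
]

true_support, polled_support = true_and_polled_support(groups)
print(f"real support:   {true_support:.1%}")
print(f"polled support: {polled_support:.1%}")
```

With these invented inputs the candidate trails in the population as a whole but leads in the poll, simply because the poll reaches one group far more often than the other. Polling more people from the same lists would not repair the result.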

What's Missing?

Many calculations are useless without knowing their context. "Seventy-five percent of Americans prefer milk to lemonade" is an impressive statistic until you realize that only twelve people were sampled, all of whom were Wisconsin dairy farmers. At that point, the surprising stat is that twenty-five percent preferred lemonade.

Whenever you note that a small group of people has been surveyed, your suspicions should be roused, and not only because of the statistical insignificance of the sample. It is quite possible, and perhaps quite common, for some researchers to conduct a multitude of small surveys until one of them produces the desired results. Huff reports an experiment he conducted: after numerous tries at tossing a penny ten times in a row, one of the attempts produced eight heads and two tails. On the basis of this survey, he concluded that a tossed penny comes up heads eighty percent of the time.
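How easy is it to get such a result by repetition alone? A minimal sketch, assuming a fair penny and independent runs, and counting any run with at least eight heads as a "success" (the function names and the numbers of repeated runs are illustrative choices, not Huff's):

```python
from math import comb


def prob_at_least(heads: int, tosses: int) -> float:
    """Chance of at least `heads` heads in `tosses` tosses of a fair coin."""
    favorable = sum(comb(tosses, k) for k in range(heads, tosses + 1))
    return favorable / 2 ** tosses


def prob_some_run_hits(heads: int, tosses: int, runs: int) -> float:
    """Chance that at least one of `runs` independent runs reaches the target."""
    p = prob_at_least(heads, tosses)
    return 1 - (1 - p) ** runs


single = prob_at_least(8, 10)  # 56 of the 1,024 possible outcomes
print(f"one run of ten tosses: {single:.1%}")

# Hypothetical numbers of repeated "surveys".
for runs in (5, 20, 50):
    print(f"{runs:>2} runs: {prob_some_run_hits(8, 10, runs):.1%}")
```

A single run of ten tosses produces eight or more heads only about 5.5 percent of the time, but the chance climbs quickly as the experiment is repeated. A researcher who reports only the run that worked is reporting her persistence rather than a property of the penny.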

Did Somebody Change the Subject?

A newscaster states that "reports of domestic violence have increased" and concludes that "domestic violence is on the rise." This conclusion is not justified, however, because the increased reporting may reflect nothing more than a greater willingness on the part of women to contact the police or a greater willingness on the part of police to file the reports. Domestic violence may, in fact, be decreasing. The newscaster has changed the subject from increased reporting to increased incidents.
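The gap between reports and incidents is easy to see with numbers. The sketch below treats the number of reports as the number of incidents multiplied by the fraction that get reported; every figure in it is hypothetical and chosen only to show that the two quantities can move in opposite directions.

```python
def reports(incidents: int, reporting_rate: float) -> float:
    """Number of incidents that end up as filed reports."""
    return incidents * reporting_rate


# Hypothetical figures for two years (assumptions, not real data).
last_year = reports(incidents=1000, reporting_rate=0.20)
this_year = reports(incidents=900, reporting_rate=0.30)

print(f"reports last year: {last_year:.0f}")
print(f"reports this year: {this_year:.0f}")
print(f"change in reports:   {this_year / last_year - 1:+.0%}")
print(f"change in incidents: {900 / 1000 - 1:+.0%}")
```

In this made-up case reports rise by about a third while incidents fall by a tenth. A statistic about reports, taken alone, cannot tell you which way the incidents themselves are moving.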

Does It Make Sense?

Never allow a statistical finding to automatically override your common sense. My husband taught me how to estimate. The technique involves taking a statistic to its logical conclusion and seeing if it reduces to absurdity. Consider the alarming statement, "over 3,000,000 teenage girls on welfare became pregnant this year."

Start with the total population of the U.S. -- roughly 300,000,000 people. Assume that a little under half are male, leaving you with roughly 160,000,000 women, a figure that reflects the greater longevity of women. Assume the age spread of women to be one to seventy-five years, of which the teenage years (13-19) constitute approximately 10.7%, or 17,120,000. Generously assume that every teenage girl can become pregnant. Divide this figure by the reported 3,000,000 pregnant welfare teens and the number you get is 5.7. In other words, according to the statistic first quoted, roughly one in every six teenage girls is not only on welfare but has also become pregnant in the last year.

Does this statistic make sense to you? Does it match other numbers coming out on welfare recipients?
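The estimate is simple enough to check in a few lines. The sketch below uses the figures from the passage as inputs; the function and variable names are illustrative, and the second calculation rests on an added assumption (ages spread evenly across seventy-five years) that is flagged in the comments.

```python
def one_in_how_many(base_population: float, share: float, claimed_count: float) -> float:
    """Return N such that the claim amounts to 'one in every N' of the subgroup."""
    subgroup = base_population * share
    return subgroup / claimed_count


WOMEN = 160_000_000   # women in the U.S., per the passage
TEEN_SHARE = 0.107    # share of women aged 13-19, per the passage
CLAIMED = 3_000_000   # pregnant welfare teens, per the quoted statistic

ratio = one_in_how_many(WOMEN, TEEN_SHARE, CLAIMED)
print(f"teenage girls: {WOMEN * TEEN_SHARE:,.0f}")
print(f"the claim implies one in every {ratio:.1f} teenage girls")

# Assumption for comparison: if ages were spread evenly over 75 years,
# the seven teenage years would be 7/75 (about 9.3%) of the total,
# which makes the claim look even more extreme.
strict_ratio = one_in_how_many(WOMEN, 7 / 75, CLAIMED)
print(f"with a 7/75 share: one in every {strict_ratio:.1f}")
```

The exact inputs matter less than the order of magnitude. Nudging any of the assumptions still leaves a claim that something like one teenage girl in five or six is both on welfare and newly pregnant.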

Always trust your common sense.