all heard people say, “You can prove anything with statistics.”* My title, Damned Lies and Statistics, comes from a famous aphorism (usually attributed to Mark Twain or Benjamin Disraeli): “There are lies, damned lies, and statistics.”2 There is even a useful little book, still in print after more than forty years, called How to Lie with Statistics.3
Statistics, then, have a bad reputation. We suspect that statistics may be wrong, that people who use statistics may be “lying”—trying to manipulate us by using numbers to somehow distort the truth. Yet, at the same time, we need statistics; we depend upon them to summarize and clarify the nature of our complex society. This is particularly true when we talk about social problems. Debates about social problems routinely raise questions that demand statistical answers: Is the problem widespread? How many people—and which people—does it affect? Is it getting worse? What does it cost society? What will it cost to deal with it? Convincing answers to such questions demand evidence, and that usually means numbers, measurements, statistics.
But can’t you prove anything with statistics? It depends on what “prove” means. If we want to know, say, how many children are “gunned down” each year, we can’t simply guess—pluck a number from thin air: one hundred, one thousand, ten thousand, 35 trillion, whatever. Obviously, there’s no reason to consider an arbitrary guess “proof” of anything. However, it might be possible for someone—using records kept by police departments or hospital emergency rooms or coroners—to keep track of children who have been shot; compiling careful, complete records might give us a fairly accurate idea of the number of gunned-down children. If that number seems accurate enough, we might consider it very strong evidence—or proof.
The solution to the problem of bad statistics is not to ignore all statistics, or to assume that every number is false. Some statistics are bad, but others are pretty good, and we need statistics—good statistics—to talk sensibly about social problems. The solution, then, is not to give up on statistics, but to become better judges of the numbers we encounter. We need to think critically about statistics—at least critically enough to suspect that the number of children gunned down hasn’t been doubling each year since 1950.
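The arithmetic behind that suspicion is easy to check. The sketch below is illustrative only: it assumes a starting count of a single child in 1950 (a hypothetical baseline chosen for the example, not a figure from any record), and the function names are invented here. It shows how quickly annual doubling runs away, and how many doublings it takes to reach the 35 trillion mentioned above.

```python
def doubled_count(start_count: int, start_year: int, year: int) -> int:
    """Return the count in `year` if it doubles every year after `start_year`."""
    if year < start_year:
        raise ValueError("year must not precede start_year")
    return start_count * 2 ** (year - start_year)


def years_to_reach(start_count: int, target: int) -> int:
    """Return the number of annual doublings needed to reach at least `target`."""
    years = 0
    count = start_count
    while count < target:
        count *= 2
        years += 1
    return years


# Assumed baseline: one child in 1950 (hypothetical, for illustration).
for year in (1950, 1960, 1970, 1980):
    print(year, f"{doubled_count(1, 1950, year):,}")
# 1950 1
# 1960 1,024
# 1970 1,048,576
# 1980 1,073,741,824

# Doublings needed to pass 35 trillion from a start of one.
print(years_to_reach(1, 35 * 10**12))  # 45
```

Even from the smallest possible starting point, thirty years of doubling yields more than a billion victims, and forty-five years yields roughly 35 trillion. No real count of children could behave this way, which is why a claim of annual doubling should raise suspicion at once.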
A few years ago, the mathematician John Allen Paulos wrote Innumeracy, a short, readable book about “mathematical illiteracy.”4 Too few people, he argued, are comfortable with basic mathematical principles, and this makes them poor judges of the numbers they encounter. No doubt this is one reason we have so many bad statistics. But there are other reasons, as well.
Social statistics describe society, but they are also products of our social arrangements. The people who bring social statistics to our attention have reasons for doing so; they inevitably want something, just as reporters and the other media figures who repeat and publicize statistics have their own goals. Statistics are tools, used for particular purposes. Thinking critically about statistics requires understanding their place in society.
While we may be more suspicious of statistics presented by people with whom we disagree—people who favor different political parties or have different beliefs—bad statistics are used to promote all sorts of causes. Bad statistics come from conservatives on the political right and liberals on the left, from wealthy corporations and powerful government agencies, and from advocates of the poor and the powerless. In this book, I have tried to choose examples that show this range: I have selected some bad statistics used to justify causes I support, as well as others offered to promote causes I oppose. I hope that you and everyone else who reads this book will find at least one discomforting example of a bad statistic presented in behalf of a cause you support. Honesty requires that we recognize our own errors in reasoning, as well as those of our opponents.
This book can help you understand the uses of social statistics and make you better able to judge the statistics you encounter. Understanding this book will not require sophisticated mathematical knowledge. We will be talking about the most basic forms of statistics: percentages, averages, and rates—what statisticians call “descriptive statistics.” These are the sorts of statistics typically addressed in the first week or so of an introductory statistics course. (The remainder of that course, like all more advanced courses in statistics, covers “inferential statistics,” complex forms of reasoning that we will ignore.) This book can help you evaluate the numbers you hear on the evening news, rather than the statistical tables printed in the American Sociological Review and other scholarly journals. Our goal is to learn to recognize the signs of really bad statistics, so that we won’t believe—let alone repeat—claims about the number of murdered children doubling each year.
_________
*For reasons that will become obvious, I have decided not to name the Graduate Student, the Author, or the Journal Editor. They made mistakes, but the mistakes they made were, as this book will show, all too common.
*For instance, since only child victims are at issue, a careful analysis would control for the relative sizes of the child population in the two years. We also ought to have assurances that the methods of counting child gunshot victims did not change over time, and so on.
*This is a criticism with a long history. In his book Chartism, published in 1840, the social critic Thomas Carlyle noted: “A witty statesman said you might prove anything with figures.”
1
THE IMPORTANCE OF SOCIAL STATISTICS
Nineteenth-century Americans worried about prostitution; reformers called it “the social evil” and warned that many women prostituted themselves. How many? For New York City alone, there were dozens of estimates: in 1833, for instance, reformers published a report declaring that there were “not less than 10,000” prostitutes in New York (equivalent to about 10 percent of the city’s female population); in 1866, New York’s Methodist bishop claimed there were more prostitutes (11,000 to 12,000) than Methodists in the city; other estimates for the period ranged as high as 50,000. These reformers hoped that their reports of widespread prostitution would prod the authorities to act, but city officials’ most common response was to challenge the reformers’ numbers. Various investigations by the police and grand juries produced their own, much lower estimates; for instance, one 1872 police report counted only 1,223 prostitutes (by that time, New York’s population included nearly half a million females). Historians see a clear pattern in these cycles of competing statistics: ministers and reformers “tended to inflate statistics”;1 while “police officials tended to underestimate prostitution.”2
Antiprostitution reformers tried to use big numbers to arouse public outrage. Big numbers meant there was a big problem: if New York had tens of thousands of prostitutes, something ought to be done. In response, the police countered that there were relatively few prostitutes—an indication that they were doing a good job. These dueling statistics resemble other, more recent debates. During Ronald Reagan’s presidency, for example, activists claimed that three million Americans were homeless, while the Reagan administration insisted that the actual number of homeless people was closer to 300,000, one-tenth what the activists claimed. In other words, homeless activists argued that homelessness was a big problem that demanded additional government social programs, while the administration argued new programs were not needed to deal with what was actually a much smaller, more manageable problem. Each side presented statistics that justified its policy recommendations, and each criticized the other’s numbers. The activists ridiculed the administration’s figures as an attempt to cover up a large, visible problem, while the administration insisted that the activists’ numbers were unrealistic exaggerations.3
Statistics, then, can become weapons in political struggles over social problems and social policy. Advocates of different positions use numbers to make their points (“It’s a big problem!” “No, it’s not!”). And, as the example of nineteenth-century estimates of prostitution reminds