Joel Best

Damned Lies and Statistics


McCready, Jacquelynn Nicnick, Meghan Shaw, Barbara Sweeney, Denise Weaver, Kelly Wesstrom, and Melissa Zwickel read the entire manuscript and gave me valuable feedback. Writing a critique of common errors in reasoning is a tricky business: it invites readers to find the mistakes in these pages. It would be nice to be able to blame scapegoats but, alas, the friends who commented on the manuscript gave me good advice—although I didn’t always take it. The flaws are mine.

      PREFACE TO THE UPDATED EDITION

      Darrell Huff’s little book, How to Lie with Statistics, made a bigger impression on me than anything else I read during my first year in college.1 It wasn’t even assigned reading; the TA in my statistics lab mentioned it in passing, the title struck me as amusing, and I borrowed the book from the campus library. It was a great read: by cataloging basic forms of statistical malpractice, Huff gave me a set of critical tools I could apply when reading news stories.

      As the years went by, it became clear to me that the errors Huff had exposed remained alive and well. Sometime in the early 1990s, I reread How to Lie with Statistics. This time, I was less impressed. While Huff still offered a terrific introduction to the topic, I realized that he’d barely scratched the surface. I started thinking about writing a book of my own, one that provided a more comprehensive, more sociological approach. Damned Lies and Statistics was the result.

      Sociology professors get used to writing for other sociologists. The chief pleasure of having written this book has been discovering the broad range of people who have read it and told me what they thought—professors and students, of course, but also all sorts of folks outside academia—journalists, activists, math teachers, judges, doctors, even a mom who'd assigned it to her homeschooled child. Lots of people have found the topic interesting, and I continue to get email messages drawing my attention to particularly dubious numbers.

      And there is no shortage of questionable numbers. This version of the book contains a new afterword that tries to explain why, even if we all agree that people ought to think more critically about the figures that inform our public debates, we seem unable to drive bad statistics out of the marketplace of ideas.

      NOTES

      1. Darrell Huff, How to Lie with Statistics (New York: Norton, 1954). For a symposium on Huff’s book, see the special section “How to Lie with Statistics Turns Fifty,” in Statistical Science 20 (2005): 205–60.

      INTRODUCTION

       The Worst Social Statistic Ever

      So the prospectus began with this (carefully footnoted) quotation: “Every year since 1950, the number of American children gunned down has doubled.” I had been invited to serve on the Student’s dissertation committee. When I read the quotation, I assumed the Student had made an error in copying it. I went to the library and looked up the article the Student had cited. There, in the journal’s 1995 volume, was exactly the same sentence: “Every year since 1950, the number of American children gunned down has doubled.”

      This quotation is my nomination for a dubious distinction: I think it may be the worst—that is, the most inaccurate—social statistic ever.

      What makes this statistic so bad? Just for the sake of argument, let’s assume that the “number of American children gunned down” in 1950 was one. If the number doubled each year, there must have been two children gunned down in 1951, four in 1952, eight in 1953, and so on. By 1960, the number would have been 1,024. By 1965, it would have been 32,768 (in 1965, the FBI identified only 9,960 criminal homicides in the entire country, including adult as well as child victims). In 1970, the number would have passed one million; in 1980, one billion (more than four times the total U.S. population in that year). Only three years later, in 1983, the number of American children gunned down would have been 8.6 billion (about twice the Earth’s population at the time). Another milestone would have been passed in 1987, when the number of gunned-down American children (137 billion) would have surpassed the best estimates for the total human population throughout history (110 billion). By 1995, when the article was published, the annual number of victims would have been over 35 trillion—a really big number, of a magnitude you rarely encounter outside economics or astronomy.
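
      The arithmetic behind those milestones is straightforward compounding: forty-five annual doublings from a single 1950 victim give two raised to the forty-fifth power, a bit over 35 trillion, by 1995. Here is a minimal Python sketch of that check, using the same purely hypothetical starting value of one victim in 1950:

          victims = 1                     # hypothetical starting value: one victim in 1950
          for year in range(1951, 1996):  # forty-five years of "doubling," 1951 through 1995
              victims *= 2
          print(f"{victims:,}")           # prints 35,184,372,088,832, i.e. over 35 trillion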

      Thus my nomination: estimating the number of American child gunshot victims in 1995 at 35 trillion must be as far off—as hilariously, wildly wrong—as a social statistic can be. (If anyone spots a more inaccurate social statistic, I’d love to hear about it.)

      Where did the article’s Author get this statistic? I wrote the Author, who responded that the statistic came from the Children’s Defense Fund (the CDF is a well-known advocacy group for children). The CDF’s The State of America’s Children Yearbook—1994 does state: “The number of American children killed each year by guns has doubled since 1950.”1 Note the difference in the wording—the CDF claimed there were twice as many deaths in 1994 as in 1950; the article’s Author reworded that claim and created a very different meaning.
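
      The gap between those two readings is enormous; a short Python sketch makes it concrete (no real death counts are involved, only the multiplication each wording implies):

          cdf_factor = 2           # CDF's wording: the 1994 figure is twice the 1950 figure (one doubling over 44 years)
          mutant_factor = 2 ** 44  # reworded claim: the figure doubles in each of those 44 years
          print(f"{mutant_factor // cdf_factor:,}")  # 8,796,093,022,208: the rewording inflates the claim by a factor of nearly nine trillion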

      Certainly, the article’s Author didn’t ask many probing, critical questions about the CDF’s claim. Impressed by the statistic, the Author repeated it—well, meant to repeat it. Instead, by rewording the CDF’s claim, the Author created a mutant statistic, one garbled almost beyond recognition.

      But people treat mutant statistics just as they do other statistics—that is, they usually accept even the most implausible claims without question. For example, the Journal Editor who accepted the Author’s article for publication did not bother to consider the implications of child victims doubling each year. And people repeat bad statistics: the Graduate Student copied the garbled statistic and inserted it into the dissertation prospectus. Who knows whether still other readers were impressed by the Author’s statistic and remembered it or repeated it? The article remains on the shelf in hundreds of libraries, available to anyone who needs a dramatic quote. The lesson should be clear: bad statistics live on; they take on lives of their own.

      This is a book about bad statistics, where they come from, and why they won’t go away. Some statistics are born bad—they aren’t much good from the start, because they are based on nothing more than guesses or dubious data. Other statistics mutate; they become bad after being mangled (as in the case of the Author’s creative rewording). Either way, bad statistics are potentially important: they can be used to stir up public outrage or fear; they can distort our understanding of our world; and they can lead us to make poor policy choices.