Joseph Schmuller

Statistical Analysis with Excel For Dummies



heads-tails split in the data is consistent with a fair coin. Think of it as the idea that nothing in the results of the study is out of the ordinary.

      An alternative hypothesis is possible: The coin isn’t a fair one, and it's loaded to produce an unequal number of heads and tails. This hypothesis says that any heads-tails split is consistent with an unfair coin. The alternative hypothesis is called, believe it or not, the alternative hypothesis. The statistical notation for the alternative hypothesis is H1.

      

Notice that I did not say “accept H0.” The way the logic works, you never accept a hypothesis. You either reject H0 or don't reject H0.

      Here’s a real-world example to help you understand this idea. Whenever a defendant goes on trial, that person is presumed innocent until proven guilty. Think of innocent as H0. The prosecutor’s job is to convince the jury to reject H0. If the jurors reject, the verdict is guilty. If they don’t reject, the verdict is not guilty. The verdict is never innocent. That would be like accepting H0.

      Back to the coin-tossing example. Remember I said “around 50 heads and 50 tails” is what you could expect from 100 tosses of a fair coin. What does around mean? Also, I said if it’s 90-10, reject H0. What about 85-15? 80-20? 70-30? Exactly how different from 50-50 does the split have to be for you to reject H0? In the reading speed example, how much greater does the improvement have to be to reject H0?

      I don't answer these questions now. Statisticians have formulated decision rules for situations like this, and you explore those rules throughout the book.
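
      To get a feel for what such a rule has to weigh, here is a minimal Python sketch (my illustration, not a method from this chapter). It computes how likely a fair coin is to produce a split at least as lopsided as the one observed in 100 tosses. The function name prob_at_least_as_extreme is hypothetical, and the code assumes independent tosses with a 0.5 chance of heads on each toss.

```python
from math import comb


def prob_at_least_as_extreme(heads: int, tosses: int = 100) -> float:
    """Chance that a fair coin gives a split at least as far from even as `heads`.

    Assumption (not from the text): tosses are independent, P(heads) = 0.5.
    """
    distance = abs(heads - tosses / 2)
    # Count every outcome whose head count is at least `distance` from the
    # middle, on either side, weighted by the number of ways it can occur.
    ways = sum(
        comb(tosses, k)
        for k in range(tosses + 1)
        if abs(k - tosses / 2) >= distance
    )
    return ways / 2 ** tosses


# The splits asked about in the text, plus two milder ones for contrast.
for heads in (90, 85, 80, 70, 60, 55):
    p = prob_at_least_as_extreme(heads)
    print(f"{heads}-{100 - heads} split: probability under a fair coin = {p:.3g}")
```

      The probability drops off quickly as the split moves away from 50-50: a 60-40 split comes out a little under 0.06, while a 90-10 split is vanishingly rare. Where to draw the line between "around 50-50" and "too lopsided" is exactly what a decision rule settles.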

      Two types of error

      Whenever you evaluate the data from a study and decide to reject H0 or to not reject H0, you can never be absolutely sure. You never really know what the true state of the world is. In the context of the coin-tossing example, that means you never know for certain if the coin is fair or not. All you can do is make a decision based on the sample data you gather. If you wanted to be certain about the coin, you'd need the data for the entire population of tosses, which means you'd have to keep tossing the coin until the end of time.

      Because you’re never certain about your decisions, it’s possible to make an error regardless of what you decide. As I mention earlier in this chapter, the coin could be fair and you just happen to get 99 heads in 100 tosses. That’s not likely, and that’s why you reject H0. It’s also possible that the coin is biased, yet you just happen to toss 50 heads in 100 tosses. Again, that’s not likely, but if it happens, you don’t reject H0.

      If you reject H0 and you shouldn't, that's a Type I error. In the coin example, that's rejecting the hypothesis that the coin is fair, when in reality it’s a fair coin.

      If you don't reject H0 and you should have, that's a Type II error. That happens if you don't reject the hypothesis that the coin is fair and in reality it's biased.
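
      The two definitions can be made concrete with a small Python sketch. Everything specific in it is an assumption for illustration only: the decision rule (reject H0 when 100 tosses give 40 or fewer heads, or 60 or more), the biased coin's 0.6 chance of heads, and the helper names binom_pmf and reject_probability. None of these come from the text.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n independent tosses with P(heads) = p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def reject_probability(p_heads: float, n: int = 100,
                       lower: int = 40, upper: int = 60) -> float:
    """Chance that the assumed rule rejects H0.

    Assumed rule (illustrative): reject H0 if heads <= lower or heads >= upper.
    """
    return sum(
        binom_pmf(k, n, p_heads)
        for k in range(n + 1)
        if k <= lower or k >= upper
    )


# Type I error: the coin really is fair, yet the rule rejects H0.
type_1_rate = reject_probability(0.5)

# Type II error: the coin really is biased (assumed P(heads) = 0.6),
# yet the rule does not reject H0.
type_2_rate = 1 - reject_probability(0.6)

print(f"Type I error rate:  {type_1_rate:.4f}")
print(f"Type II error rate: {type_2_rate:.4f}")
```

      With these assumed numbers, the Type I error rate is a little under 0.06 and the Type II error rate is a little under one-half. Widening the cutoffs makes Type I errors rarer and Type II errors more common, so the two kinds of error trade off against each other.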

      How do you know if you've made either type of error? You don't, at least not right after you make your decision to reject or not reject H0. (If it were possible to know, you wouldn't make the error in the first place!) All you can do is gather more data and see if the additional data are consistent with your decision.

      If you think of H0 as a tendency to maintain the status quo and not interpret anything as being out of the ordinary (no matter how it looks), a Type II error means you missed out on something big. Looked at in that way, Type II errors form the basis of many historical ironies.

      Here’s what I mean: In the 1950s, a particular TV show gave talented young entertainers a few minutes to perform on stage and a chance to compete for a prize. The audience voted to determine the winner. The producers held auditions around the country to find people for the show. Many years after the show went off the air, the producer was interviewed. The interviewer asked him if he had ever turned down anyone at an audition whom he shouldn’t have.

      “Well,” said the producer, “once a young singer auditioned for us and he seemed really odd.”

      “In what way?” asked the interviewer.

      “In a couple of ways,” said the producer. “He sang really loud, gyrated his body and his legs when he played the guitar, and he had these long sideburns. We figured this kid would never make it in show business, so we thanked him for showing up, but we sent him on his way.”

      “Wait a minute — are you telling me you turned down …?”

      “That's right. We actually said no … to Elvis Presley!”

      Now that's a Type II error.

      A chapter on data evaluation might seem an odd place to talk about Excel fundamentals. This section and the next one help you get started with the statistical work that begins in Chapter 2 and continues throughout the book.


      FIGURE 1-2: The Excel interface in Windows.

      Microsoft has developed shorthand for describing a mouse-click on a command button that lives on a tab on the Ribbon, and I use that shorthand throughout this book. The shorthand is

      Tab | Command Button

      To indicate clicking on the Insert tab’s Recommended Charts category button, for example, I write

      Insert | Recommended Charts


      FIGURE 1-3: Clicking Insert | Recommended Charts opens this box.