The first variety is called nominal data. If a number is a piece of nominal data, it’s just a name. Its value doesn't signify anything. A good example is the number on an athlete’s jersey. It’s just a way of identifying the athlete. The number has nothing to do with the athlete’s level of skill.
Next come ordinal data. Ordinal data are all about order, and numbers begin to take on meaning over and above just being identifiers: a higher number indicates more of a particular attribute than a lower number does. One example is the Mohs scale, which mineralogists have used since 1822 to rate the hardness of minerals on a scale of 1 through 10. Diamond, rated at 10, is the hardest. Talc, rated at 1, is the softest. A substance with a given rating can scratch any substance with a lower rating.
What’s missing from the Mohs scale (and from all ordinal data) is the idea of equal intervals and equal differences. The difference between a hardness of 10 and a hardness of 8 is not the same as the difference between a hardness of 6 and a hardness of 4.
Interval data provide equal differences. Fahrenheit temperatures provide an example of interval data. The difference between 60 degrees and 70 degrees is the same as the difference between 80 degrees and 90 degrees.
Here’s something that might surprise you about Fahrenheit temperatures: A temperature of 100 degrees isn’t twice as hot as a temperature of 50 degrees. For ratio statements (twice as much as, half as much as) to be valid, zero has to mean the complete absence of the attribute you're measuring. A temperature of 0 degrees F doesn’t mean the absence of heat — it’s just an arbitrary point on the Fahrenheit scale.
The last data type, ratio data, includes a meaningful zero point. For temperatures, the Kelvin scale gives ratio data. One hundred degrees Kelvin is twice as hot as 50 degrees Kelvin. This is because the Kelvin zero point is absolute zero, where all molecular motion (the basis of heat) stops. Another example is a ruler. Eight inches is twice as long as four inches. A length of zero means a complete absence of length.
Any of these data types can form the basis of an independent variable or a dependent variable. The analytical tools you use depend on the type of data you’re dealing with.
A little probability
When statisticians make decisions, they express their confidence about those decisions in terms of probability. They can never be certain about what they decide. They can only tell you how probable their conclusions are.
So, what is probability? The best way to attack this is with a few examples. If you toss a coin, what's the probability that it comes up heads? Intuitively, you know that if the coin is fair, you have a 50-50 chance of heads and a 50-50 chance of tails. In terms of the kinds of numbers associated with probability, that’s ½.
How about rolling a die? (That’s one member of a pair of dice.) What’s the probability that you roll a 3? Hmm… A die has six faces and one of them is 3, so that ought to be 1⁄6, right? Right.
Here’s one more. You have a standard deck of playing cards. You select one card at random. What’s the probability that it’s a club? Well, a deck of cards has four suits, so that answer is ¼.
I think you’re getting the picture. If you want to know the probability that an event occurs, figure out how many ways that event can happen and divide by the total number of events that can happen. In each of the three examples, the event we’re interested in (heads, 3, or club) happens only one way.
Things can get a bit more complicated. When you toss a die, what’s the probability that you roll a 3 or a 4? Now you’re talking about two ways the event you’re interested in can occur, so that’s 2⁄6 = 1⁄3. What about the probability of rolling an even number? That has to be 2, 4, or 6, and the probability is 3⁄6 = 1⁄2.

On to another kind of probability question. Suppose you roll a die and toss a coin at the same time. What’s the probability you roll a 3 and the coin comes up heads? Consider all the possible events that can occur when you roll a die and toss a coin at the same time. The outcome can be a head paired with 1 through 6, or a tail paired with 1 through 6. That’s a total of 12 possibilities. The head-and-3 combination can happen only one way, so the answer is 1⁄12.

In general, the formula for the probability that a particular event occurs is

Pr(event) = (Number of ways the event can occur) ⁄ (Total number of possible events)
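To make the formula concrete, here’s a short sketch in Python (my choice of language for illustration). The `fractions` module keeps the answers as exact fractions, just as they appear in the examples:

```python
from fractions import Fraction

def probability(favorable, total):
    """Pr(event) = (ways the event can occur) / (total possible events)."""
    return Fraction(favorable, total)

# Rolling a 3 or a 4: two favorable faces out of six.
print(probability(2, 6))    # 1/3

# Rolling an even number (2, 4, or 6): three favorable faces out of six.
print(probability(3, 6))    # 1/2

# Rolling a 3 AND tossing heads: 1 favorable outcome out of 6 * 2 = 12.
print(probability(1, 12))   # 1/12
```

Note that `Fraction` reduces each result automatically, so 2⁄6 comes back as 1⁄3.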
I begin this section by saying that statisticians express their confidence about their decisions in terms of probability, which is really why I brought up this topic in the first place. This line of thinking leads me to conditional probability — the probability that an event occurs given that some other event occurs. For example, suppose I roll a die, take a look at it (so that you can't see it), and tell you I’ve rolled an even number. What’s the probability that I've rolled a 2? Ordinarily, the probability of a 2 is 1⁄6, but I’ve narrowed the field. I’ve eliminated the three odd numbers (1, 3, and 5) as possibilities. In this case, only the three even numbers (2, 4, and 6) are possible, so now the probability of rolling a 2 is 1⁄3.
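The narrowing of the field can be sketched by brute-force enumeration (again in Python, purely for illustration): restrict the sample space to the outcomes that satisfy the condition, then count favorable outcomes within it.

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]

# The condition "I rolled an even number" narrows the sample space.
evens = [f for f in faces if f % 2 == 0]   # [2, 4, 6]

# Pr(rolled a 2 | roll is even) = favorable outcomes / outcomes meeting the condition
p = Fraction(sum(1 for f in evens if f == 2), len(evens))
print(p)   # 1/3
```

Without the condition, the denominator would be all six faces and the answer would be 1⁄6; conditioning on "even" shrinks the denominator to three.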
Exactly how does conditional probability play into statistical analysis? Read on.
Inferential Statistics: Testing Hypotheses
In advance of doing a study, a statistician draws up a tentative explanation — a hypothesis — of why the data might come out a certain way. After the study is complete and the sample data are all tabulated, the statistician faces the essential decision every statistician has to make: whether or not to reject the hypothesis.
That decision is wrapped in a conditional probability question — what’s the probability of obtaining the sample data, given that this hypothesis is correct? Statistical analysis provides tools to calculate the probability. If the probability turns out to be low, the statistician rejects the hypothesis.
Suppose you’re interested in whether or not a particular coin is fair — whether it has an equal chance of coming up heads or tails. To study this issue, you'd take the coin and toss it a number of times — say, 100. These 100 tosses make up your sample data. Starting from the hypothesis that the coin is fair, you'd expect that the data in your sample of 100 tosses would show around 50 heads and 50 tails.
If it turns out to be 99 heads and 1 tail, you’d undoubtedly reject the fair coin hypothesis. Why? The conditional probability of getting 99 heads and 1 tail given a fair coin is very low. Wait a second. The coin could still be fair and you just happened to get a 99-1 split, right? Absolutely. In fact, you never really know. You have to gather the sample data (the results from 100 tosses) and make a decision. Your decision might be right, or it might not.
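Just how low is that conditional probability? For a fair coin, the chance of exactly k heads in n tosses follows the binomial formula, which a few lines of Python (my sketch, not anything from this chapter) can evaluate:

```python
from math import comb

def binom_prob(k, n, p=0.5):
    """Probability of exactly k heads in n tosses of a coin with Pr(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Pr(99 heads and 1 tail | the coin is fair)
print(binom_prob(99, 100))   # about 7.9e-29 -- vanishingly small

# Even the single most likely outcome, exactly 50 heads, has only about an 8% chance.
print(binom_prob(50, 100))
```

A probability on the order of 10⁻²⁹ is why you’d reject the fair-coin hypothesis, even though a fair coin could, in principle, produce that 99-1 split.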
Juries face this dilemma all the time. They have to decide among competing hypotheses that explain the evidence in a trial. (Think of the evidence as data.) One hypothesis is that the defendant is guilty. The other is that the defendant is not guilty. Jury members have to consider the evidence and, in effect, answer a conditional probability question: What’s the probability of the evidence given that the defendant is not guilty? The answer to this question determines the verdict.
Null and alternative hypotheses
Consider once again the coin tossing study I mention in the preceding section. The sample data are the results from the 100 tosses. Before tossing the coin, you might start with the hypothesis that the coin is a fair one so that you expect an equal number of heads and tails. This starting point is called the null hypothesis. The statistical notation