Daniel J. Denis

Applied Univariate, Bivariate, and Multivariate Statistics



of intelligence. Rather, it was simply an arbitrary point on the IQ scale presumably denoting a particular quantity of IQ (even if, in all probability, very small).

      What gives us license to make statements of ratios? The element of the ratio scale that permits us to make such statements is the fact that the ratio scale has at its origin a true zero point. When something is deemed measurable at the ratio scale, a measurement of zero actually means zero of the thing that is being measured. Was this fact true of the interval scale? No, because zero degrees Fahrenheit did not equate to there being zero temperature. “Zero” was simply an arbitrary value on the scale. However, the fact that I have zero coins in my pocket actually means that I have zero coins. “Zero” is said to be, in this case, “absolute,” meaning that there is truly nothing there.

      When we speak of a mathematical variable (or simply, variable), we mean a symbol that at any point could be replaced by values contained in a specified set. For instance, consider the mathematical variable yi. The subscript i indicates that yi stands for a set of values, not all equal to the same number (otherwise y would be a constant), such that at any point in time any value in the set could serve as a temporary "replacement" for the symbol.

      Of course, social and natural sciences are all about variables. Here are some examples:

       Height of persons in the world is a variable because persons of the world have different heights. However, height would be considered a constant if 10 people in a room were of the exact same height (and those were the only people we were considering).

       Blood pressure is a variable because persons, animals, and other living creatures have different blood pressure measurements.

       Intelligence (IQ) of human beings (difficult to measure to be sure, though psychology has developed instruments in an attempt to assess such things) is a variable because presumably people have differing intellectual capacities.

       Earned run average (ERA) of baseball players is a variable because players do not all have the same ERA.

      A random variable is a mathematical variable that is associated with a probability distribution. That is, as soon as we assign probabilities to values of the variable, we have a random variable. More formally, we can say that a random variable is a function from a sample space into the real numbers (Casella and Berger, 2002), which essentially means that elements in the set (i.e., sample space) have probabilities associated with them (Dowdy, Wearden, and Chilko, 2004).

Mathematical Variable yi     Random Variable yi
y1 = 1                       y1 = 1 (p = 0.20)
y2 = 3                       y2 = 3 (p = 0.50)
y3 = 5                       y3 = 5 (p = 0.30)
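The distinction in the table above can be sketched in a few lines of Python (a hypothetical illustration, not from the text): a mathematical variable is simply a set of admissible values, while a random variable pairs each value with a probability, so that realizations can actually be drawn.

```python
import random

# Mathematical variable: just a set of values the symbol y_i may take on.
y = [1, 3, 5]

# Random variable: the same values with probabilities attached
# (the probabilities from the table above; they must sum to 1).
p = [0.20, 0.50, 0.30]
assert abs(sum(p) - 1.0) < 1e-9  # a valid probability distribution

# Drawing realizations of the random variable: in the long run,
# each value occurs in proportion to its probability.
random.seed(0)
draws = random.choices(y, weights=p, k=10)
print(draws)
```

Every draw is one of the values 1, 3, or 5; the weights govern how often each appears over repeated sampling.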

      The distinction between mathematical and random variables is important when we discuss such things as means, variances, and covariances. A reader first learning about random variables, having already mastered the concept of sample or population variance (to be discussed shortly), can be somewhat taken aback when encountering the variance of a random variable, given as

\sigma^2 = E(y_i - \mu)^2

      and then attempting to compare it to the more familiar variance of a population:

\sigma^2 = \frac{\sum_{i=1}^{N}(y_i - \mu)^2}{N}

      Realize, however, that both expressions are essentially similar: each accounts for squared deviations from the mean. The variance of a random variable, though, is stated in terms of its expectation, E. Throughout this book, we will see the operator E at work. What is an expectation? The expectation E of a random variable is the mean of that random variable, which amounts to a probability‐weighted average (Gill, 2006). The operator E occurs several times throughout this book because in theoretical statistics, long‐run averages of a statistic are of special interest. As noted by Feller (1968, p. 221), should an experiment be repeated n times under identical conditions, the average of those trials should be close to the expectation. Perhaps less formally, the operator E tells us what we might expect to see in the long run for large n. Theoretical statisticians love taking expectations, because the short run of a variable is seldom of interest at a theoretical level. It is the long (probability) run that is often of most theoretical interest. As a crude analogy, on a personal level, you may be “up” or “down” now, but if your expectation E pointed to a favorable long‐run endpoint, then perhaps that is enough to convince you that though “on the way” is a rough and tumbly road, in the end, as the spiritual would say, we “arrive” at our expectation (which perhaps some would denote as an afterlife of sorts).
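The expectation as a probability-weighted average, and Feller's observation that the average of n repeated trials approaches it, can both be checked numerically. The following Python sketch (a hypothetical illustration using the distribution from the table above) computes E(y) and the variance of the random variable directly from the probabilities, then simulates a long run of trials:

```python
import random

y = [1, 3, 5]
p = [0.20, 0.50, 0.30]

# Expectation as a probability-weighted average: E(y) = sum of y_i * p_i
expectation = sum(value * prob for value, prob in zip(y, p))
print(expectation)  # 0.2*1 + 0.5*3 + 0.3*5 = 3.2

# Variance of the random variable: probability-weighted squared
# deviations from the mean, E(y_i - mu)^2.
variance = sum(prob * (value - expectation) ** 2 for value, prob in zip(y, p))
print(variance)  # 0.2*(1-3.2)^2 + 0.5*(3-3.2)^2 + 0.3*(5-3.2)^2 = 1.96

# Feller's point: for large n, the average of repeated trials under
# identical conditions should be close to the expectation.
random.seed(42)
n = 100_000
trials = random.choices(y, weights=p, k=n)
long_run_average = sum(trials) / n
print(long_run_average)  # close to 3.2 for large n
```

Note that no n appears in the computation of the expectation or variance themselves: the probabilities do the weighting that division by n performs in the sample or population formulas.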

      The key point is that when we are working with expectations, we are working with probabilities. Thus, instead of summing squared deviations of the kind (yi − μ)² as one does in the sample or population variance, for which there is a specified n, one must rather assign probabilities to these squared deviations, which is what is essentially being communicated by the notation “E(yi − μ)².” We can “unpack” this expression to read