that we can assume that we are measuring a continuous attribute with a somewhat faulty instrument in which the measurement error varies slightly across the range of values, as if we were measuring lengths with a metric tape in which the marks were erased in some sections so we have to take an approximate reading in those sections. In such a case, it would appear that the attribute had been measured in an ordinal scale while it has actually been measured in an interval scale. This is why we often see data obtained with some clinical questionnaires presented and analyzed as if it were interval data.
The last central tendency measure is the mode. The mode is simply the most common value of an attribute. It has the advantage over the other measures of central tendency that it can be used with all types of scales of measurement, including categorical scales. The mode, however, has many disadvantages and this is why it is seldom used in clinical research. One important problem with the mode is that there is no guarantee that it is a unique value. The most frequent values of an attribute may occur with the same frequency, and then we will have several modes. In addition, in very small samples, each value may occur only once and we will have as many modes as values. No further mention will be made of the mode throughout this book.
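The two problems just described, tied modes and small samples in which every value is a mode, can be seen in a minimal sketch. It assumes Python 3.8 or later, where the standard library provides `statistics.multimode`; the data values are made up for illustration only.

```python
from statistics import multimode

# Hypothetical observations: 2 and 3 both occur twice, so there are two modes.
tied_values = [1, 2, 2, 3, 3]
tied_modes = multimode(tied_values)

# Hypothetical very small sample: every value occurs once,
# so every value is returned as a mode.
small_sample = [4, 7, 9]
small_sample_modes = multimode(small_sample)

print(tied_modes)
print(small_sample_modes)
```

In both cases the result is a list rather than a single number, which is exactly why the mode is an awkward summary of central tendency.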
So far, we have been considering attributes measured in interval or ordinal scales. However, we are often interested in attributes that may be characterized only by their presence or absence (e.g. family history of asthma) or that classify subjects into two groups (e.g. males and females, death and survival).
As we saw in Section 1.2, attributes taking only two values are called binary attributes. They represent the most elementary type of measurement and, therefore, convey the smallest amount of information. It is useful to think of binary attributes as attributes that may be on or off, because then the above distinction is not necessary. For example, we may think of the “sex” attribute simply as “male sex,” and of its values as yes and no. Similarly, the outcome could be thought of as only “survival,” with values yes and no. This is the same as for the family history of asthma, which also has the values yes and no.
We could convey the same information as yes/no by using the numerical system. Therefore, we could give the attribute the value 1 to mean that it was present, and 0 to mean that it was absent. This is much more appropriate, because now we can think of binary variables not as categories, but as numerical variables that happen to take only two possible values, 0 and 1.
Furthermore, observations from binary variables are commonly presented as relative frequencies as in, for example, 37% females or 14% with family history of asthma. If we adopt the 0/1 values for binary variables, those proportions are nothing more than the means of a variable with values 0 and 1. If males have value 0 and females 1, then in a sample of 200 subjects with 74 females the values of the attribute sex sum to 74, which, divided by 200 (the sample size), gives the result 0.37, or 37%.
1.4 Sampling
Sampling is such a central issue in biostatistics that an entire chapter of this book is devoted to discussing it. This is necessary for two main reasons: first, because an understanding of the statistical methods requires a clear understanding of the sampling phenomena; second, because the purpose of sampling is widely misunderstood.
Sampling is a relatively recent addition to statistics. For almost two centuries, statistical science was concerned only with censuses, the study of entire populations. Nearly a century ago, however, people realized that populations could be studied more easily, more quickly, and more economically if observations were used from only a small part of the population, a sample of the population, instead of the whole population. The basic idea was that, provided a sufficient number of observations were made, the patterns of interest in the population would be reproduced in the sample. The measurements made in the sample would then mirror the measurements in the population.
This approach to sampling had, as a primary objective, to obtain a miniature version of the population. The assumption was that the observations made in the sample would reflect the structure of the population. This is very much like going to a store and asking for a sample taken at random from a piece of cloth. Later, by inspecting the sample, one would remember what the whole piece was like. By looking at the colors and patterns of the sample, one would know what the colors and patterns were in the whole piece (Figure 1.5).
Now, if the original piece of cloth had large, repetitive patterns but the sample was only a tiny piece, by looking at the sample one would not be able to tell exactly what the original piece was like. This is because not every pattern and color would be present in the sample, and the sample would be said not to be representative of the original cloth. Conversely, if the sample was large enough to contain all the patterns and colors present in the piece, the sample would be said to be representative (Figure 1.6).
This is very much the reasoning behind the classical approach to sampling. The concept of representativeness of a sample was tightly linked to its size: large samples tend to be representative, while small samples give unreliable results because they are not representative of the population. The weakness of this approach, however, is its lack of objectivity in the definition of an adequate sample size.
Figure 1.5 Classical view of the purpose of sampling.
Figure 1.6 Relationship between representativeness and sample size in the classic view of sampling. The concept of representativeness is closely related to sample size.
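The classical link between sample size and representativeness can be illustrated with a small simulation. This is only a sketch under made-up assumptions: a hypothetical population of 1000 units carrying 20 equally common patterns, simple random sampling without replacement, and a helper function `fraction_of_patterns_seen` introduced here for illustration.

```python
import random

def fraction_of_patterns_seen(population, sample_size, rng):
    """Draw a simple random sample without replacement and return the
    fraction of the distinct population patterns that appear in it."""
    sample = rng.sample(population, sample_size)
    return len(set(sample)) / len(set(population))

# Hypothetical population: 1000 units, 20 equally common patterns (0..19).
population = [unit % 20 for unit in range(1000)]
rng = random.Random(0)  # fixed seed so that a run can be repeated

# Average, over 200 repeated samples, of the fraction of patterns captured.
for n in (5, 20, 100, 1000):
    runs = [fraction_of_patterns_seen(population, n, rng) for _ in range(200)]
    print(n, round(sum(runs) / len(runs), 2))
```

A sample of 5 units can never show more than 5 of the 20 patterns, while a sample equal to the whole population necessarily shows all of them. Between those extremes the sketch offers no objective cut-off for what counts as large enough.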
Some people might say that the sample size should be in proportion to the total population. If so, this would mean that an investigation on the prevalence of, say, chronic heart failure in Norway would require a much smaller sample than the same investigation in Germany. This makes little sense. Now suppose we want to investigate patients with chronic heart failure. Would a sample of 100 patients with chronic heart failure be representative? What about 400 patients? Or do we need 1000 patients? In each case, the sample size is always an almost insignificant fraction of the whole population.
If it does not make much sense to think that the ideal sample size is a certain proportion of the population (even more so because in many situations the population size is not even known), would a representative sample then be the one that contains all the patterns that exist in the population? If so, how many people will we have to sample to make sure that all possible patterns in the population also exist in the sample? For example, some findings typical of chronic heart failure, like an S3‐gallop and alveolar edema, are present in only 2 or 3% of patients, and the combination of these two findings (assuming they are independent and each is present in 2% of patients) should exist in only 1 out of 2500 patients. Does this mean that no study of chronic heart failure with fewer than 2500 patients should be considered representative? And what should we do when the structure of the population is unknown?
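The figure of 1 in 2500 follows from multiplying the two prevalences, which is valid only under the independence assumption stated in the text. A minimal sketch, using exact fractions and a helper `joint_prevalence` introduced here for illustration, shows the calculation at 2% and, for comparison, at 3%:

```python
from fractions import Fraction

def joint_prevalence(p_a, p_b):
    """Probability that two independent findings occur in the same patient."""
    return p_a * p_b

# Each finding present in 2% of patients.
both_at_2pct = joint_prevalence(Fraction(2, 100), Fraction(2, 100))

# Each finding present in 3% of patients, for comparison.
both_at_3pct = joint_prevalence(Fraction(3, 100), Fraction(3, 100))

print(both_at_2pct)              # 1/2500
print(float(1 / both_at_3pct))   # patients per expected case at 3%
```

At 3% per finding the combination is somewhat more common, roughly 1 in 1100 patients, but the argument is unchanged: requiring a sample to contain every rare combination quickly demands thousands of patients.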
The problem of lack of objectivity in defining sample representativeness can be circumvented if we adopt a different reasoning when dealing with samples. Let us accept that we have no means of knowing what the population structure truly is, and all we can possibly have is a sample of the population. Then, a realistic procedure would be to look at the sample and, by inspecting its structure, formulate a hypothesis about the structure of the population. The structure of the sample constrains the hypothesis to be consistent with the observations.
Taking the above example on the samples of cloth, the situation now is as if we were given a sample of cloth and