are listed in the Bibliography. For assistance in cross‐referencing, we classify items according to chapter. Thus, Section 9.1, Figure 9.1, Table 9.1 and Example 9.1 are all to be found in Chapter 9.
2 Health Care Investigations: Measurement and Sampling Concepts
2.1 Introduction
A health care investigation is typically a five‐stage process: identifying objectives; planning; data collection; analysis; and, finally, reporting. The methodologies frequently used are sample surveys, clinical trials and epidemiological studies. These are the subject of this and subsequent chapters. However, we must first be clear about the definitions of some basic terms. Many of the terms used in statistics are used in daily life, where their meanings might be quite different. The word ‘population’ may conjure images of ‘people’, while ‘sample’ might mean a ‘free sample’ of cream offered by a pharmaceutical company, or a ‘sample’ requested by a doctor for urine analysis. In statistics, however, these words have much more precise meanings.
2.2 Populations, Samples and Observations
In statistics, the term ‘population’ is extended to mean any collection of individual items or units that are the subject of investigation. Characteristics of a population that differ from individual to individual are called variables. Length, age, weight, temperature and number of heart beats per minute are examples of variables to which numbers or values can be assigned. Once numbers or values have been assigned to the variables, they can be measured.
Because it is rarely practicable to obtain measures of a particular variable from all the units in a population, the investigator has to collect information from a smaller group or sub‐set that represents the group as a whole. This sub‐set is called a sample. Each unit in the sample provides a record, such as a measurement, which is called an observation. The relationship between the terms we have introduced is summarized below:
Observation: 3.62 kg
Variable: weight
Sample unit (item): a new‐born male baby
Sample: those new‐born male babies that are weighed
Statistical population: all new‐born male babies that are available for weighing
Note that the biological or demographic population would include babies of both sexes and, indeed, all individuals of whatever age or sex in a particular community.
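The relationship between these terms can be sketched in a few lines of Python. This is only an illustration: the weights are simulated rather than real data, and the population size, mean, spread and sample size are all assumed values chosen for the example.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical statistical population: weights (kg) of all new-born male
# babies available for weighing. The values are simulated, not real data.
population = [round(rng.gauss(3.5, 0.5), 2) for _ in range(1000)]

# Sample: the babies that are actually weighed (here, 20 chosen at random).
sample = rng.sample(population, k=20)

# Observation: the value of the variable 'weight' recorded for one sample unit.
observation = sample[0]

print(f"population size: {len(population)}")
print(f"sample size: {len(sample)}")
print(f"one observation: {observation} kg")
```

Each element of `sample` is one observation on the variable weight, and the sample itself is a sub-set of the statistical population.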
2.3 Counting Things – The Sampling Unit
We sometimes wish to count the number of items or objects in a group or collection. If the number is to be meaningful, the dimensions of the collection have to be specified.
For example, ‘the number of patients admitted to an accident and emergency department’ has little meaning unless we know the time scale over which the count was made. A collection with specified dimensions is called a sampling unit. An observation is, of course, the number of objects or items counted in a sampling unit. Thus, if 52 patients are admitted to a particular Accident and Emergency (A&E) department in a 24‐hour period, the sampling unit is ‘one A&E 24‐hour period’ and the observation is 52. The sample is the set of such 24‐hour periods that were included in the survey, and the number of periods is the sample size. However, the definition of the ‘population’ requires care. It might be tempting to think that the population under investigation is something to do with patients, but this is not the case when they are being counted. The statistical population comprises the same ‘thing’ as the sampling units that make up the sample. In this case, the statistical population is a rather abstract concept, and represents all possible A&E department 24‐hour periods that could have been included in the survey.
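A minimal sketch of the A&E example, using made-up admission times, may help to separate the sampling unit from the things being counted. It assumes that a 24‐hour period is taken to be a calendar day; the variable names are illustrative only.

```python
from collections import Counter
from datetime import datetime

# Hypothetical admission timestamps for one A&E department (made-up data).
admissions = [
    datetime(2024, 3, 1, 0, 15),
    datetime(2024, 3, 1, 9, 40),
    datetime(2024, 3, 1, 23, 5),
    datetime(2024, 3, 2, 2, 30),
    datetime(2024, 3, 2, 14, 10),
    datetime(2024, 3, 3, 8, 0),
]

# Sampling unit: one A&E 24-hour period (assumed here to be a calendar day).
# Observation: the number of patients admitted in that period.
counts = Counter(t.date() for t in admissions)
observations = [counts[day] for day in sorted(counts)]

# The sample is the set of periods surveyed, not the set of patients.
sample_size = len(observations)

print(observations)  # [3, 2, 1]
print(sample_size)   # 3
```

Note that six patients appear in the data, but the sample size is three, because three 24‐hour periods were surveyed. A period with no admissions would not appear in `counts` at all, so a real survey would have to list the surveyed periods explicitly and record zero for such a period.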
It is very important to be able to identify correctly the population under investigation, because this is essential in formulating a ‘null hypothesis’ when undertaking statistical tests. This is the subject of Chapter 11.
2.4 Sampling Strategy
As we said above, it is not always possible or practicable to sample every single individual or unit in a particular population, owing either to its size or to constraints on available resources (for example, cost, time, manpower). The solution is to take a sample from the population of interest and use the sample information to make inferences about the population.
A common, but misguided, approach to sampling is to first decide what data to collect, then undertake the survey, and finally, decide what analyses should be done. However, without initial thought being given to the aims of the survey, the information or data may not be appropriate (e.g. wrong data collected, or data collected on wrong subjects, or insufficient data collected). As a result, the desired analysis may not be possible or effective.
The key to good sampling is to:
1 Formulate the aims of the study.
2 Decide what analysis is required to satisfy these aims.
3 Decide what data are required to facilitate the analysis.
4 Collect the data required by the survey.
The crucial point relates to the sequence. For example, if the aim of a study is to identify the effectiveness of asthmatic care within a single GP practice, suitable measures of effectiveness need to be defined. One measure could be based on the number of acute asthma exacerbations (deteriorations) in the preceding 12 months, and this number could be compared with that for the 12 months before that. Other measures might assess the number of patients who have had their inhaler technique checked or are using peak flow meters at home. Most of this information can be obtained from practice records, although cross‐checking with hospital records may be required to validate the assessment based on acute exacerbations.
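As a rough illustration of the first measure, the sketch below counts acute exacerbations in two consecutive 12‐month periods. The records, patient identifiers and survey date are all invented for the example, and `count_between` is a hypothetical helper, not part of any established method.

```python
from datetime import date

# Hypothetical practice records: (patient id, date of acute exacerbation).
exacerbations = [
    ("p02", date(2022, 11, 3)),
    ("p05", date(2023, 1, 20)),
    ("p02", date(2023, 4, 9)),
    ("p06", date(2023, 9, 14)),
    ("p05", date(2024, 2, 2)),
]

survey_date = date(2024, 6, 1)    # assumed date of the study
one_year_ago = date(2023, 6, 1)
two_years_ago = date(2022, 6, 1)


def count_between(records, start, end):
    """Count exacerbations whose date satisfies start <= date < end."""
    return sum(1 for _, d in records if start <= d < end)


# Most recent 12 months compared with the 12 months before that.
recent = count_between(exacerbations, one_year_ago, survey_date)
earlier = count_between(exacerbations, two_years_ago, one_year_ago)

print(f"most recent 12 months: {recent}")
print(f"12 months before that: {earlier}")
```

The point of the sketch is the order of the decisions: the measure (exacerbations per 12‐month period) is fixed first, and that in turn dictates which data, here dated exacerbation records, need to be collected.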
2.5 Target and Study Populations
We have to distinguish between the target and study populations. The target population in the asthma example above comprises all patients registered with the GP practice who have asthma. The study population consists of all patients who could actually be selected to form the sample, i.e. those who are known to have asthma. For example, some members of the target population may not know that they have asthma; they will therefore not be recorded as asthmatic and, thus, will not form part of the study population. Ideally, the ‘target’ and ‘study’ populations coincide.
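The distinction can be expressed with sets. The sketch below uses invented patient identifiers and assumes, purely for illustration, that we could know who truly has asthma; in practice that set is exactly what cannot be observed in full.

```python
# Hypothetical patient identifiers for one GP practice (made-up data).
registered = {"p01", "p02", "p03", "p04", "p05", "p06"}

# Patients who truly have asthma; in practice this set is not fully known.
has_asthma = {"p02", "p03", "p05", "p06"}

# Patients whose asthma is known and recorded.
diagnosed = {"p02", "p05"}

target_population = registered & has_asthma
study_population = registered & diagnosed

# Members of the target population who can never be selected for the sample.
missed = target_population - study_population

print(sorted(target_population))  # ['p02', 'p03', 'p05', 'p06']
print(sorted(study_population))   # ['p02', 'p05']
print(sorted(missed))             # ['p03', 'p06']
```

When `missed` is empty, the target and study populations coincide. Otherwise, inferences drawn from the sample strictly apply only to the study population.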
2.6 Sample Designs
Once the study population has been defined, the next task is to decide which subjects from the population should form the sample. The following list is not exhaustive, but gives a selection of sample designs:
simple random sampling,
systematic sampling,
stratified sampling,
quota sampling and
cluster sampling.
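The first three designs in this list can be sketched as follows. The sketch uses the usual textbook definitions, which may differ in detail from the treatment given in this chapter; the patient list, the age‐group strata and the sample size of five are assumptions made for the example.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical study population: 20 patients, each tagged with an age group
# (8 children and 12 adults).
patients = [(f"p{i:02d}", "child" if i < 8 else "adult") for i in range(20)]

# Simple random sampling: every subset of size 5 is equally likely.
simple = rng.sample(patients, k=5)

# Systematic sampling: random start, then every k-th unit from the list.
step = len(patients) // 5      # sampling interval, here 4
start = rng.randrange(step)    # random start in 0..step-1
systematic = patients[start::step]

# Stratified sampling: a separate simple random sample within each stratum,
# here proportional to stratum size (2 children and 3 adults).
strata = {}
for p in patients:
    strata.setdefault(p[1], []).append(p)

stratified = []
for group, members in strata.items():
    n = round(5 * len(members) / len(patients))
    stratified.extend(rng.sample(members, k=n))

print(simple)
print(systematic)
print(stratified)
```

All three designs need a complete list of the study population to draw from, which is the sense in which they require a finite, enumerable population.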
The first three designs can be applied to sampling from finite populations, i.e. situations where every member of the