About 600 000 people die annually out of the roughly 65 million people in the United Kingdom. Hence, for a single individual, with no information about their age or state of health, the chance or probability of dying in any particular year is 600 000/65 000 000 = 0.009, or just under 1 in 100. This is termed the crude mortality rate as it ignores differences between individuals due, for example, to their gender or age, both of which are known to influence mortality. From year to year this probability of dying is fairly stable (see Figure 4.2), although there has been a long‐term decline over the years in the probability of dying. This illustrates that the number of deaths in a group can be accurately predicted but, despite this, it is not possible to predict exactly which particular individuals are going to die.
Figure 4.2 Crude mortality rates in the United Kingdom from 1982 to 2016.
(Source: data from ONS 2017, https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/deathregistrationssummarytablesenglandandwalesdeathsbysingleyearofagetables).
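The crude mortality rate calculation above is simple enough to reproduce in a few lines of code. The sketch below uses the approximate figures quoted in the text rather than exact ONS counts.

```python
# A minimal sketch of the crude mortality rate calculation described above.
# The figures are the approximate values quoted in the text, not exact ONS counts.
deaths_per_year = 600_000
population = 65_000_000

crude_mortality_rate = deaths_per_year / population
print(f"Crude mortality rate: {crude_mortality_rate:.3f}")   # about 0.009
print(f"Roughly 1 in {round(1 / crude_mortality_rate)}")     # about 1 in 108, i.e. just under 1 in 100
```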
The basis of the idea of probability is a sequence of what are known as independent trials. To estimate the probability of an individual dying in one year, each individual in a group is given a 'trial' over a year, and the event occurs if the individual dies. As already indicated, the estimate of the (crude) probability of dying is the number of deaths divided by the number in the original group. The idea of independence is difficult, but here it means that whether one individual survives or dies does not affect the chance of another individual's survival.
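The two ideas above, a stable overall death rate but unpredictable individual deaths, can be illustrated with a small simulation. The sketch below assumes a hypothetical group of 100 000 independent individuals, each dying within the year with probability 0.009; the group size and number of runs are invented for illustration.

```python
# A sketch simulating a year of independent 'trials': each of 100 000
# hypothetical individuals dies with probability 0.009. The total number of
# deaths is quite stable across runs, but which individuals die is not.
import random

random.seed(1)      # arbitrary seed so the sketch is reproducible
p_die = 0.009       # crude probability of dying in a year (from above)
n = 100_000         # hypothetical group size

for run in range(3):
    deaths = [random.random() < p_die for _ in range(n)]
    print(f"Run {run + 1}: total deaths = {sum(deaths)} (expected about {round(p_die * n)})")
```

Each run produces a total close to 900, even though the set of individuals who 'die' changes completely from run to run.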
On a very simple level, and where the probability of an event is known in advance, consider tossing one coin repeatedly a large number of times. Each toss of the coin is an 'experiment' or 'trial', and the result of each toss is an outcome (head or tail). If the coin is unbiased, that is, one which has no preference for 'heads' or 'tails', we would expect heads half of the time and thus say that the probability of a head is 0.5. This leads to the long‐term relative frequency definition of probability: the probability of a specific outcome is the proportion of times that outcome would happen if we repeated the experiment a large number of times.
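The long‐term relative frequency definition can also be seen in a short simulation. The sketch below assumes an unbiased coin (probability of a head exactly 0.5) and shows the observed proportion of heads settling towards that value as the number of tosses grows.

```python
# A sketch of the long-run relative frequency definition: toss a fair coin
# repeatedly and track the proportion of heads as the number of tosses grows.
import random

random.seed(2)      # arbitrary seed for reproducibility
heads = 0
tosses = 0
for target in (10, 100, 1_000, 10_000, 100_000):
    while tosses < target:
        heads += random.random() < 0.5      # True counts as 1
        tosses += 1
    print(f"{tosses:>6} tosses: proportion of heads = {heads / tosses:.3f}")
```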
For example, the proportion of male births out of total births in England and Wales in 2016 was 357 046/696 271 = 0.51. Since all births in England and Wales must be registered within 42 days of the child being born, and this is a large sample, we can use this as an estimate of the probability that a baby born in England and Wales will be male.
Similarly, when it is stated that patients with a certain disease have a 50% chance of surviving five years, this is based on past experience of other patients with the same disease. In some cases a ‘trial’ may be generated by randomly selecting an individual from the general population, as discussed in Chapters 5 and 14, and examining him or her for the particular attribute in question. For example, suppose the prevalence of diabetes in the population is 1%. The prevalence of a disease is the number of people in a population with the disease at a certain time divided by the number of people in the population (see Chapter 14 for further details). If a trial was then conducted by randomly selecting one person from the population and testing him or her for diabetes, the individual would be expected to be diabetic with probability 0.01. If this type of sampling of individuals from the population were repeated, then the proportion of diabetics in the total sample taken would be expected to be approximately 1%.
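This repeated sampling can be mimicked directly. The sketch below assumes a population prevalence of 1% and an arbitrary number of repeated single‐person trials; with enough trials the observed proportion of diabetics is close to 0.01.

```python
# A sketch of the repeated sampling 'trial' described above: pick one person
# at random from a population in which 1% have diabetes, many times over.
import random

random.seed(3)        # arbitrary seed for reproducibility
prevalence = 0.01     # assumed population prevalence (from the text)
n_trials = 50_000     # illustrative number of repeated single-person trials

diabetic = sum(random.random() < prevalence for _ in range(n_trials))
print(f"Proportion found to be diabetic: {diabetic / n_trials:.4f}")   # close to 0.01
```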
However, in some situations we can determine probabilities without repeated sampling. For example, we know that the probability of a '6' when throwing a six‐sided die is 1/6, because there are six possibilities, all equally likely. Nevertheless, we may wish to conduct a series of trials to verify this fact.
In genetics, if a child has cystic fibrosis (CF) but neither parent was affected, then it is known that each parent must have genotype cC, where c denotes a CF gene and C denotes a normal gene. The possibility that one of the parents is cc is discounted, as this would imply that one parent had CF. In any subsequent child in that family there are four possible and equally likely (mother–father) genotype combinations: cc, Cc, cC, and CC. Only cc leads to the disease. Thus, it is known that the probability of a subsequent child being affected is 1/4, and if the child is not affected (and so is Cc, cC or CC), the probability of being a carrier (type Cc or cC) is 2/3. These ‘model based’ probabilities are based on the Mendelian theory of genetics.
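Because the four genotype combinations are equally likely, these model‐based probabilities can be recovered by simple enumeration, as in the sketch below (a check of the figures in the text rather than anything new).

```python
# A sketch enumerating the four equally likely child genotype combinations
# when both parents are carriers (cC), as described above.
from itertools import product

combinations = ["".join(pair) for pair in product("cC", repeat=2)]   # ['cc', 'cC', 'Cc', 'CC']

affected = [g for g in combinations if g == "cc"]                     # has cystic fibrosis
unaffected = [g for g in combinations if g != "cc"]
carriers = [g for g in unaffected if "c" in g]                        # carries one CF gene

print(f"P(child affected) = {len(affected)}/{len(combinations)}")             # 1/4
print(f"P(carrier given not affected) = {len(carriers)}/{len(unaffected)}")   # 2/3
```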
Another type of probability is 'subjective' probability. When a patient presents with chest pains, a clinician may, after a preliminary examination, say that the probability that the patient has heart disease is about 20%. However, although the clinician does not know this yet, the individual patient either has or has not got heart disease. Thus, at this early stage of investigation the probability is a measure of the strength, or degree, of the clinician's belief in the two alternative hypotheses: that the patient has got heart disease, or that the patient has not. The next step is then to proceed to further examinations of the patient in order to modify the strength of this initial subjective belief, so that the clinician becomes more certain of which is the true situation – the patient has heart disease or the patient does not. We commonly come across subjective probability in the gaming industry. The odds offered on a horse winning a race, for example, are a measure of how likely the bookmaker thinks it is that the horse will win. They are based not just on how often the horse has won before, but also on other factors such as the jockey and the course conditions.
In some circumstances we will have some prior knowledge or belief about the chance or likelihood of an event and, as long as it can be quantified, it is possible to combine this prior belief with the observed frequency data to give an updated and better estimate of the probability of the event. An example of this is when we apply Bayes' Theorem (see Chapter 13) to diagnostic data and use the prevalence, sensitivity, and specificity of a test to give us the positive predictive value. Further application using statistical distributions of degrees of belief to modify data gives rise to the body of statistical methods known as Bayesian statistics.
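As a rough illustration of this updating, the sketch below applies Bayes' Theorem to a diagnostic test; the prevalence, sensitivity, and specificity used are invented for the example and are not taken from the text (Chapter 13 covers the details).

```python
# A sketch of Bayes' Theorem applied to diagnostic test data.
# All three input values below are illustrative assumptions only.
prevalence = 0.01     # prior probability of disease
sensitivity = 0.90    # P(test positive | disease)
specificity = 0.95    # P(test negative | no disease)

p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
ppv = sensitivity * prevalence / p_positive     # P(disease | test positive)
print(f"Positive predictive value: {ppv:.3f}")  # about 0.154
```

Even with a fairly accurate test, the low prevalence (the prior) keeps the positive predictive value modest, which is exactly the kind of updating of belief by data that Bayesian methods formalise.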
The three types of probability all have the following basic properties.
1 All probabilities lie between 0 and 1.
2 If two outcomes or events are mutually exclusive, so that they cannot both occur at the same time, the probability of either happening is the sum of the two individual probabilities (this is known as the 'addition rule').
3 If two outcomes are independent (i.e. knowing the outcome of one experiment tells us nothing about the other experiment), then the probability of both occurring is the product of the individual probabilities (this is known as the ‘multiplication rule’).
When the outcome can never happen the probability is 0. When the outcome will definitely happen the probability is 1. If two events are mutually exclusive then only one of them can happen. For example, the outcomes of a trial might be death (probability 5%) or severe disability (probability 20%). Thus, by the addition rule, the probability of either death or severe disability is 25%.
If two events are independent then the fact that one has happened does not affect the chance of the other event happening. For example, consider the probability that a pregnant woman gives birth to a boy (event A) and the probability of a white Christmas (event B). These two events are unconnected, since the probability of giving birth to a boy is not related to the weather at Christmas.
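The two rules can be put together in a short worked sketch; the probability of a white Christmas used below is an invented illustrative value, while the other figures come from the examples above.

```python
# Addition rule (mutually exclusive outcomes): death or severe disability.
p_death = 0.05
p_severe_disability = 0.20
p_either = p_death + p_severe_disability
print(f"P(death or severe disability) = {p_either:.2f}")    # 0.25

# Multiplication rule (independent events): a boy is born and Christmas is white.
p_boy = 0.51                 # from the births example earlier in the section
p_white_christmas = 0.10     # illustrative assumption, not from the text
p_both = p_boy * p_white_christmas
print(f"P(boy and white Christmas) = {p_both:.3f}")         # about 0.051
```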