claim a causal effect of drug A on outcome Y? That is, will taking treatment make the subject better or not better? The answer is “probably not,” even if we observe scenario 1 where the subject did get better after taking treatment A. Why? The subject might get better without taking drug A. Therefore, at an individual level, a causal relationship between the intervention (taking drug A) and an outcome cannot be established because we cannot observe the “counterfactual” outcome had the patient not taken such action.
If we were somehow able to know both the actual outcome of an intervention and the counterfactual outcome, that is, the outcome of the opposite, unobserved intervention (though in fact we are never able to observe the counterfactual outcome), then we could assess whether a causal effect exists between A and Y. Table 2.2 returns to the four possible scenarios in Table 2.1, but now with knowledge of both the outcome and the “counterfactual” outcome.
Table 2.2: Possible Causal Effect Scenarios
Unfortunately, in reality, we will not likely be able to observe both the outcome and its “counterfactual” simultaneously while keeping all other features of the subject unchanged. That is, we are not able to observe the “counterfactual” outcome on the same subject. This presents a critical challenge for assessing causal effect in research where causation is of interest. In summary, we might have to admit that understanding the causal relationship at the individual subject level is not attainable. Two approaches to address this issue are provided in Sections 2.3.3 and 2.3.4.
2.3 From R.A. Fisher to Modern Causal Inference Analyses
2.3.1 Fisher’s Randomized Experiment
For a long period of time, statisticians, even the great pioneers like Francis Galton and Karl Pearson, tended not to talk about causation but rather association or correlation (for example, Pearson’s correlation coefficient). Regression modeling was used as a tool to assess the association between a set of variables and the outcome of interest. The estimated regression coefficients sometimes were interpreted as causal effect (Yule, 1895, 1897, 1899), though such an interpretation could be misleading (Imbens and Rubin, 2015). Such confusion was not clarified until Sir Ronald Fisher brought clarity through the idea of a randomized experiment.
Fisher wrote a series of papers and books in the 1920s and 1930s (Fisher, 1922, 1930, 1936a, 1936b, 1937) on randomized experiments. Fisher stated that when comparing treatment effect between treatment and control groups, randomization could remove the systematic distortions that biased the causal treatment effect estimates. Note, the so-called “systematic distortions” could be either measured or unmeasured. With a perfect randomization, the control group will provide counterfactual outcomes for the observed performance in the treatment group, so that the causal effect can be estimated. So with randomization, a causal interpretation of the relationship between the treatment and the outcome is possible. Because of its ability to evaluate the causal treatment effect in a less biased manner, the concept of the randomized experiment was gradually accepted by researchers and regulators worldwide. Double-blinded, randomized clinical trials have become and remain the gold standard in seeking an approval of a human pharmaceutical product.
Randomized controlled trials (RCTs) remain at the top of the hierarchy of evidence largely because of their ability to generate causal interpretations for treatment effects. However, RCTs also have limitations:
1. it is not always possible to conduct an RCT due to ethical or practical constrains
2. they have great internal validity but often lack external validity (generalizability)
3. they are often not designed with sufficient power to study heterogeneous causal treatment effect (subgroup identification)
With the growing availability of large, real world health care data, there is a growing interest of non-randomized observational study for assessing the real world causal effects of interventions. Without randomization, proper assessment of causal effects is difficult. For example, in routine clinical practice, a group of patients receiving treatment A might be younger and healthier than another group of patients receiving treatment B, even if A and B have same target population and indication. Therefore, a direct comparison of the outcome between those two groups of patients could be biased because of the imbalances in important patient characteristics between the two groups. Variables that influence both the treatment choice and the outcome are confounders and their existence presents an important methodological challenge for estimating causal effect in non-randomized studies. So, what can one do? Fisher himself didn’t give an answer, but the idea of inferring causation through randomized experiment influenced the field of statistics and eventually lead to well-accepted causal frameworks, for example, a framework developed by Rubin and a framework developed by Pearl and Robins for inferring causation from non-randomized studies.
2.3.2 Neyman’s Potential Outcome Notation
Before formally introducing a causal framework, it is necessary to briefly review the notation of “potential outcomes.” Potential outcomes were first proposed by Neyman (1923) to explain causal effect in randomized experiments, but were not used elsewhere for decades before other statisticians realized their value in inferring causation in non-randomized studies.
Neyman’s notation begins as follows. Assume T=0 and T=1 are the two interventions or treatments for comparison, and Y is the outcome of interest. Every subject in the study has two potential outcomes:
2.3.3 Rubin’s Causal Model
Rubin’s Causal Model (RCM), was named by Holland (Holland, 1986) in recognition of the seminal work in this area conducted by Donald Rubin in 1970s and early 1980s (Rubin 1974, 1977, 1978, 1983). Below, we provide a brief description of the RCM and readers who are interested in learning more can read the numerous papers and books already written on this framework (Holland 1988, Little and Yau (1998), Angrist et al. (1996), Frangakis and Rubin (2002), Rubin (2004), Rubin (2005), Rosenbaum (2010), Rosenbaum (2017), Imbens and Rubin (2015)).
Using Neyman’s potential outcome notation, the individual causal treatment effect between two treatments T=0 and T=1 can be defined as:
Note, though we are able to define the individual causal treatment effect in theory, it is NOT estimable because we can only observe one potential outcome of the same subject while keeping other confounders unchanged. Instead, we can define other types of causal treatment effect that are estimable (“estimand”). For example, the average causal treatment effect (ATE),
where
In