Chapter 1: Introduction to Observational and Real World Evidence Research
1.2 Definition and Types of Real World Data (RWD)
1.3 Experimental Versus Observational Research
1.4 Types of Real World Studies
1.4.2 Retrospective or Case-control Studies
1.4.3 Prospective or Cohort Studies
1.5 Questions Addressed by Real World Studies
1.6 The Issues: Bias and Confounding
1.7 Guidance for Real World Research
1.8 Best Practices for Real World Research
1.1 Why This Book?
Advances in communication and information technologies have led to an exponential increase in the collection of real-world data. Data in the health sector are generated not only during clinical research but also during many instances of the patient-clinician relationship. Such data are then processed to administer and manage health services and are stored by a growing number of health registries and medical devices. These data serve as the basis for the growing use of real world evidence (RWE) in medical decision-making. However, data alone are not evidence. A core element of producing RWE is the use of designs and analytical methods that are both valid and appropriate for such data. This book is about the analytical methods used to turn real world data into valid and meaningful real world evidence.
In 2010, we produced a book, Analysis of Observational Health Care Data Using SAS (Faries et al. 2010), to bring together in a single place many of the best practices for real-world and observational data research. A focus of that effort was to make best practice analyses feasible to implement by providing SAS code with example applications. Since that time, however, there have been several improvements in analytic methods, a coalescing of thought on best practices, and significant upgrades in SAS procedures targeted at real world research, such as the PSMATCH and CAUSALTRT procedures. In addition, the growing demand for real world evidence, together with the interest in raising its quality to the level required for regulatory decision making, has necessitated updating the prior work.
This book has the same general objective as the 2010 text: to bring together best practices in a single location and to provide SAS code and examples that make the analyses relatively easy and efficient. In addition, we use newer SAS procedures that allow efficient coding of previously challenging methods (such as optimal matching). We also present several emerging topics of interest, including algorithms for personalized medicine, methods that address the complexities of time-varying confounding, extensions of propensity scoring to comparisons among more than two interventions, sensitivity analyses for unmeasured confounding, use of real-world data to generalize RCT evidence, and implementation of model averaging. As before, foundational methods such as propensity score matching, stratification, and weighting are still covered in detail.
The main focus of this book is causal inference methods, that is, the challenge of producing valid comparisons of outcomes between intervention groups using non-randomized data sources. The remainder of this introductory chapter provides a brief overview of real world data, uses of real world data, designs and guidance for real world data research, and some general best practices. It serves as a reference and as introductory reading prior to the detailed applications using SAS in later chapters.
1.2 Definition and Types of Real World Data (RWD)
Real world data has been defined by the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) as everything that goes beyond what is normally collected in phase III clinical trial programs (RCTs) (Garrison et al. 2007). Similarly, the Duke-Margolis Center for Health Policy and the Food and Drug Administration define RWD as “data relating to patient health status and/or the delivery of health care routinely collected from a variety of sources.” These definitions cover many different types and sources of data: not only data from observational studies conducted in clinical settings, but also electronic health records (EHRs), claims and billing data, product and disease registries, and data gathered through personal devices and health applications (NEHI 2015). RWD can comprise data from patients, clinicians, hospitals, payers, and many other sources. There is some debate regarding the limits of RWD, since some institutions also consider pragmatic clinical trials to be RWD (Makady et al. 2015). Others place pragmatic trials on a continuum between purely observational and clinical-trial-like studies, based on a set of factors (Tosh et al. 2011). Note that in this book we use the terms “real world” and “observational” interchangeably.
1.3 Experimental Versus Observational Research
One of the main objectives of medicine, if not the most important, is discovering the best treatment for each disease. To achieve this objective, medical researchers usually compare the effects of different treatments on the course of a disease, with the randomized clinical trial (RCT) as the gold-standard design for such research. In an RCT, the investigator compares the outcomes of patients assigned to different treatments. To ensure a high degree of internal validity of the results, treatment assignment is usually random, which is expected to produce treatment groups that are similar at baseline with respect to the factors that may determine the outcomes, such as disease severity, co-morbidities, or other prognostic factors. With this design, we assume that outcome differences among the groups are caused by differences in the efficacy of the treatments. (See Chapter 2 for a technical discussion of causal inference.) Because the research protocol decides who will receive a treatment, RCTs are considered experimental research. In observational research, by contrast, the investigators collect information without changing clinical practice: medications are not assigned to patients randomly but are prescribed by clinicians following their own criteria. This means that similarity between groups of patients receiving different treatments cannot be assumed. For example, assume that there are two treatments for a disease: one is known to be more effective but might produce more frequent and severe adverse events, and the other is much better tolerated but is known to be less effective. Typically,