Uwe Siebert

Real World Health Care Data Analysis



values were used to measure the strength. The terms strongly associated with the intervention options are included in the final propensity score model.

      Imbens and Rubin (2015) proposed an iterative approach to constructing the propensity score model. First, the covariates that are viewed as important for explaining the intervention assignment and possibly related to the outcomes are included. Second, the remaining covariates are added to the model one at a time based on likelihood ratio statistics that test the null hypothesis that the added covariate has a coefficient of 0. Last, higher-order terms and interactions of the covariates selected in the second step are added to the existing model iteratively and are retained if the likelihood ratio statistic exceeds a pre-specified value.
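      The likelihood ratio comparison used in the second and third steps can be written in generic notation. This is only a sketch: the symbols for the log-likelihood, the statistic, and the threshold C are labels introduced here for illustration and are not taken from the source.

```latex
\Lambda_k \;=\; 2\left[\,\ell\big(\hat{\beta}^{(+k)}\big) - \ell\big(\hat{\beta}^{(0)}\big)\right],
\qquad \text{add term } k \text{ if } \Lambda_k > C .
```

      Here the second log-likelihood is the maximized value for the current model and the first is the maximized value after adding candidate term k. Under the null hypothesis that the added coefficient is 0, the statistic is approximately chi-square distributed with one degree of freedom in large samples.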

      However, for these two methods, the authors do not provide specific guidelines for selecting the threshold values for the t-statistics or the likelihood ratio statistics. Instead, they consider a range of values and the corresponding range of estimated treatment effects. These issues make it difficult to implement either approach as an automatic model selection procedure for propensity score estimation.

      In parametric modeling, we assume a data model with unknown parameters and use the data to estimate those parameters. A mis-specified model can therefore cause significant bias in the estimated propensity scores. In contrast to the parametric approach, nonparametric models build the relationship between an outcome and predictors through a learning algorithm without an a priori data model. Classification and regression trees (CART) are a well-known example of a nonparametric approach. To estimate propensity scores, CART partitions a data set into regions such that, within each region, observations are as homogeneous as possible and therefore have similar probabilities of receiving treatment. CART has advantageous properties, including the ability to handle missing data without imputation and insensitivity to outliers. Additionally, interactions and non-linearities are modeled naturally as a result of the partitioning process rather than through a priori specification. However, CART has difficulty modeling smooth functions and is prone to overfitting.
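      The partitioning idea can be illustrated with a single split on one covariate. The following is a minimal sketch in Python, chosen so the example runs standalone; it is not the book's SAS implementation. It assumes Gini impurity as the homogeneity measure, and the function names and toy data are hypothetical.

```python
from typing import List, Tuple


def gini(treated: int, n: int) -> float:
    """Gini impurity of a region holding `treated` treated subjects out of n."""
    if n == 0:
        return 0.0
    p = treated / n
    return 2.0 * p * (1.0 - p)


def best_split(x: List[float], t: List[int]) -> Tuple[float, float, float]:
    """Find the threshold on one covariate that minimizes weighted Gini impurity.

    Returns (threshold, propensity_left, propensity_right), where each
    propensity is the fraction of treated subjects in that region.
    """
    n = len(x)
    values = sorted(set(x))
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [ti for xi, ti in zip(x, t) if xi <= thr]
        right = [ti for xi, ti in zip(x, t) if xi > thr]
        impurity = (len(left) * gini(sum(left), len(left))
                    + len(right) * gini(sum(right), len(right))) / n
        if best is None or impurity < best[0]:
            best = (impurity, thr,
                    sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("need at least two distinct covariate values")
    return best[1], best[2], best[3]


# Toy data (hypothetical): older subjects are more likely to be treated.
age = [30, 35, 40, 45, 50, 55, 60, 65]
treated = [0, 0, 1, 0, 1, 1, 1, 1]
threshold, ps_left, ps_right = best_split(age, treated)
print(threshold, ps_left, ps_right)
```

      A full CART would apply this search recursively over all covariates. The estimated propensity of exactly 1 in one region of this toy example also hints at the overfitting problem noted above.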

      To remedy these limitations, several approaches have been proposed, such as pruned CART to address overfitting. Bootstrap aggregated (bagged) CART involves fitting a CART to a bootstrap sample drawn with replacement and of the same size as the original sample, repeated many times. For each observation, the number of times it is classified into a category by the set of trees is counted, with the final treatment assignment based on an average or majority vote over all the trees. Random forests are similar to bagging, but they use a random subsample of predictors in the construction of each CART.
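      One way to read the vote count as a propensity score estimate is to take the fraction of trees that classify a subject as treated. The notation below is introduced here for illustration only: B is the number of bootstrap samples and T_b(x) is the class predicted for covariate vector x by the tree fitted to the b-th bootstrap sample.

```latex
\hat{e}(x) \;=\; \frac{1}{B} \sum_{b=1}^{B} \mathbf{1}\{\, T_b(x) = 1 \,\}
```

      A majority vote corresponds to assigning the treated class whenever this fraction exceeds one half.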

      Another approach, boosted CART, has been shown to outperform alternative methods in terms of prediction error. Boosted CART goes through multiple iterations of tree fitting on random subsets of the data, like bagged CART or random forests. However, at each iteration, the new tree gives greater priority to the data points that were incorrectly classified by the previous tree. This method adds together many simple functions to estimate a smooth function of a large number of covariates. While each individual simple function might be a poor approximation to the function of interest, together they are able to approximate a smooth function, just as a sequence of linear segments can approximate a smooth curve.
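      The "sum of many simple functions" can be sketched as an additive model. The form below assumes the trees are combined on the log-odds scale, as is common for gradient boosting with a binary outcome; the symbols M (number of iterations) and f_m (the m-th simple tree) are labels introduced here, not notation from the source.

```latex
F_M(x) \;=\; \sum_{m=1}^{M} f_m(x),
\qquad
\hat{e}(x) \;=\; \frac{1}{1 + \exp\{-F_M(x)\}}
```

      Under this assumption, each added tree adjusts the estimated log-odds of treatment, and the number of iterations M acts as a tuning parameter.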

      As McCaffrey et al. (2004) suggested, the gradient boosting algorithm should stop at the number of iterations that minimizes the average standardized absolute mean difference (ASAM) in the covariates. The operating characteristics of these algorithms depend on hyper-parameter values that guide the model development process. The default values of these hyper-parameters might be suitable for some applications but not for others. While xgboost (Chen, 2015, 2016) has been available in the open-source community for several years, SAS Viya provides its own gradient boosting CAS action (gbtreetrain) and accompanying procedure (PROC GRADBOOST). Both are similar to xgboost but include some enhancements. One major benefit is the auto-tuning feature, available through the AUTOTUNE statement in PROC GRADBOOST, which can help identify the best hyper-parameter settings for each use case so that researchers do not need to adjust them manually. Note that PROC GRADBOOST aims to minimize the prediction error rather than ASAM, and more research is needed to understand how to optimize PROC GRADBOOST when the criterion is ASAM in the covariates. Program 4.10 illustrates how to use PROC GRADBOOST to build the boosted CART model.

      Program 4.10: Gradient Boosting Model for Propensity Score Estimation

      * gradient boosting for PS estimation: tune hyper-parameters, fit the tuned model, and obtain PS;
      proc gradboost data=REFL seed=117 earlystop(stagnation=10);
        autotune kfold=5 popsize=30;
        id subjid cohort;
        target cohort/level=nominal;
        input Gender Race DrSpecialty/level=nominal;
        input DxDur Age BMI_B BPIInterf_B BPIPain_B CPFQ_B FIQ_B GAD7_B
              ISIX_B PHQ8_B PhysicalSymp_B SDS_B/level=interval;
        output out=mycas.dps;
      run;

      * our focus is on PS=P(opioid);
      data lib.dps;
        set mycas.dps;
        PS=P_Cohortopioid;
      run;
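      Once estimated propensity scores are available, the ASAM criterion mentioned earlier can be computed directly. The following Python sketch is a standalone illustration, not part of Program 4.10. It assumes one particular convention: controls are weighted by the odds PS/(1-PS), and each difference is standardized by the treated-group standard deviation. The function name and toy data are hypothetical.

```python
import math
import statistics
from typing import Sequence


def asam(columns: Sequence[Sequence[float]],
         treated: Sequence[int],
         ps: Sequence[float]) -> float:
    """Average standardized absolute mean difference across covariates.

    columns: one sequence of values per covariate, all in subject order.
    treated: 1 for treated subjects, 0 for controls.
    ps:      estimated propensity scores, strictly between 0 and 1.
    """
    smds = []
    for col in columns:
        x_t = [x for x, t in zip(col, treated) if t == 1]
        x_c = [x for x, t in zip(col, treated) if t == 0]
        # Assumed weighting: controls weighted by the odds of treatment.
        w_c = [p / (1.0 - p) for p, t in zip(ps, treated) if t == 0]
        mean_t = statistics.mean(x_t)
        sd_t = statistics.stdev(x_t)  # requires at least two treated subjects
        if sd_t == 0:
            raise ValueError("treated-group standard deviation is zero")
        mean_c = sum(w * x for w, x in zip(w_c, x_c)) / sum(w_c)
        smds.append(abs(mean_t - mean_c) / sd_t)
    return statistics.mean(smds)


# Toy data (hypothetical): two treated subjects followed by two controls.
treated = [1, 1, 0, 0]
age = [50, 60, 40, 50]
bmi = [30, 34, 32, 32]

flat_ps = [0.5, 0.5, 0.5, 0.5]    # equal weights for both controls
tilted_ps = [0.5, 0.5, 0.2, 0.8]  # up-weights the older control

asam_flat = asam([age, bmi], treated, flat_ps)
asam_tilted = asam([age, bmi], treated, tilted_ps)
print(asam_flat, asam_tilted)
```

      In a boosting context, a function like this could be evaluated after each iteration and the iteration with the smallest value retained, which is the stopping idea attributed above to McCaffrey et al. (2004).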

      A natural question to ask is which of the three propensity score estimation approaches should be used in a particular study. There is no definitive answer to this question. Parametric models are easier to interpret, and the a priori model approach allows researchers to incorporate knowledge from outside the data into model building, for example, clinical evidence on which variables should be included. However, the risk of model mis-specification is not ignorable. The nonparametric CART approach, especially boosted CART, performs well in predicting the treatment given the data. In addition, CART handles missing data naturally in the partitioning process, so it does not require imputation of missing covariate values. However, the CART approach is not as interpretable as the parametric modeling approach, and prior knowledge is difficult to incorporate because CART is a data-driven process. We suggest that researchers assess the quality of the propensity score estimates and use the desired quality to drive the model selection. For the remainder of this section, we discuss some proposed criteria for evaluating the quality of propensity score estimates.

      As a reminder of what we presented earlier in this chapter, the ultimate goal of using propensity scores in observational studies is to create balance in the distributions of the confounding covariates between the treatment and control groups. Thus, a “good” propensity score estimate should be able to induce good balance between the comparison groups. Imbens and Rubin (2015) provided the following approach to assessing such balance.

      1. Stratify the subjects based on their estimated propensity score. For more details, see section 13.5 of Imbens and Rubin (2015).

      2. Assess the global balance for each covariate across strata. Calculate the sample mean and variance of the difference in the kth covariate between the treatment and control groups within each stratum, then use the weighted mean and variance of these differences to form a test statistic for the null hypothesis that the weighted average mean difference is 0. Under the null hypothesis, the test statistic is approximately normally distributed. Therefore, if the absolute z-value is substantially larger than 1, balance for that covariate has not been achieved, whereas z-values close to 0 are consistent with balance.

      3. Assess the balance for each covariate within each stratum (for all strata). Calculate the sample mean