Modern causal inference approaches to improve power for subgroup analysis in randomized controlled trials
By: Antonio D'Alessandro, Jiyu Kim, Samrachana Adhikari and more
Potential Business Impact:
Find drug effects in small patient groups.
Randomized controlled trials (RCTs) often include subgroup analyses to assess whether treatment effects vary across pre-specified patient populations. However, these analyses frequently suffer from small sample sizes, which limit the power to detect heterogeneous effects. Power can be improved by leveraging predictors of the outcome -- i.e., through covariate adjustment -- as well as by borrowing external data from similar RCTs or observational studies. The benefits of covariate adjustment may be limited when the trial sample is small. Borrowing external data can increase the effective sample size and improve power, but it introduces two key challenges: (i) integrating data across sources can lead to model misspecification, and (ii) practical violations of the positivity assumption -- where the probability of receiving the target treatment is near-zero for some covariate profiles in the external data -- can lead to extreme inverse-probability weights and unstable inferences, ultimately negating potential power gains. To address these challenges, we present an approach to improving power in pre-planned subgroup analyses of small RCTs that leverages both baseline predictors and external data. We propose debiased estimators that accommodate parametric, machine learning, and nonparametric Bayesian methods. To address practical positivity violations, we introduce three estimators: a covariate-balancing approach, an automated debiased machine learning (DML) estimator, and a calibrated DML estimator. We show improved power in various simulations and offer practical recommendations for the application of the proposed methods. Finally, we apply them to evaluate the effectiveness of citalopram for negative symptoms in first-episode schizophrenia patients across subgroups defined by duration of untreated psychosis, using data from two small RCTs.
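The positivity problem described in the abstract can be illustrated with a small sketch. This is not the authors' estimator: it only shows, under assumed toy propensity scores, how a single near-zero treatment probability produces an extreme inverse-probability weight and shrinks the effective sample size. The function names and the use of Kish's effective sample size as a diagnostic are illustrative choices, not taken from the paper.

```python
def inverse_probability_weights(propensities):
    """Return the weight 1/e for each propensity score e in (0, 1]."""
    for e in propensities:
        if not 0.0 < e <= 1.0:
            raise ValueError(f"propensity score out of range: {e}")
    return [1.0 / e for e in propensities]


def effective_sample_size(weights):
    """Kish's effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


# Hypothetical propensity scores for ten treated patients (toy values).
well_overlapped = [0.5] * 10          # every patient had a fair chance of treatment
poor_overlap = [0.5] * 9 + [0.01]     # one patient was very unlikely to be treated

ess_good = effective_sample_size(inverse_probability_weights(well_overlapped))
ess_poor = effective_sample_size(inverse_probability_weights(poor_overlap))

print(f"ESS with good overlap: {ess_good:.2f}")  # 10.00
print(f"ESS with poor overlap: {ess_poor:.2f}")  # 1.39
```

In this toy case one patient with a propensity of 0.01 receives a weight of 100, so ten observations carry roughly the information of 1.4. This is the kind of instability that can cancel the power gained from borrowing external data, and it is what the covariate-balancing, automated DML, and calibrated DML estimators named in the abstract are meant to address.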
Similar Papers
Utilizing subgroup information in random-effects meta-analysis of few studies
Methodology
Improves medical study results with few data points.
Machine learning to optimize precision in the analysis of randomized trials: A journey in pre-specified, yet data-adaptive learning
Methodology
Helps doctors get more accurate results from medical tests.
Sample size and power calculations for causal inference of observational studies
Methodology
Helps studies find real causes with less data.