Design-based finite-sample analysis for regression adjustment
By: Dogyoon Song
Potential Business Impact:
Makes experiment results more precise, even when there are more covariates than observations.
In randomized experiments, regression adjustment leverages covariates to improve the precision of average treatment effect (ATE) estimation without requiring a correctly specified outcome model. Although well understood in low-dimensional settings, its behavior in high-dimensional regimes -- where the number of covariates $p$ may exceed the number of observations $n$ -- remains underexplored. Furthermore, existing theory is largely asymptotic, providing limited guidance for finite-sample inference. We develop a design-based, non-asymptotic analysis of the regression-adjusted ATE estimator under complete randomization. Specifically, we derive finite-sample-valid confidence intervals with explicit, instance-adaptive widths that remain informative even when $p > n$. These intervals rely on oracle (population-level) quantities, and we also outline data-driven envelopes that are computable from observed data. Our approach hinges on a refined swap sensitivity analysis: stochastic fluctuation is controlled via a variance-adaptive Doob martingale and Freedman's inequality, while design bias is bounded using Stein's method of exchangeable pairs. The analysis suggests how covariate geometry governs concentration and bias through leverages and cross-leverages, shedding light on when and how regression adjustment improves on the difference-in-means baseline.
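To make the comparison with the difference-in-means baseline concrete, the following is a minimal simulation sketch under complete randomization with fixed potential outcomes. It does not reproduce the paper's confidence intervals or its $p > n$ analysis. It assumes the classical low-dimensional setting ($p < n$) and uses arm-wise ordinary least squares on centered covariates as one common form of regression adjustment, which may differ from the estimator analyzed in the paper. The function names, the simulated data, and all parameter values (`n`, `p`, `tau`, the noise scale) are illustrative assumptions.

```python
import numpy as np

def difference_in_means(y, z):
    """Baseline estimator: treated mean minus control mean."""
    return y[z == 1].mean() - y[z == 0].mean()

def regression_adjusted_ate(y, z, X):
    """Arm-wise OLS on covariates centered over all units (assumed form).

    With centered covariates, each arm's fitted intercept estimates that
    arm's mean outcome at the average covariate value.
    """
    Xc = X - X.mean(axis=0)
    intercepts = {}
    for arm in (1, 0):
        mask = z == arm
        design = np.column_stack([np.ones(mask.sum()), Xc[mask]])
        coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        intercepts[arm] = coef[0]
    return intercepts[1] - intercepts[0]

def leverages(X):
    """Diagonal of the hat matrix of the centered covariates (needs p < n)."""
    Xc = X - X.mean(axis=0)
    Q, _ = np.linalg.qr(Xc)  # reduced QR: Q has shape (n, p)
    return np.sum(Q**2, axis=1)

# Fixed finite population (illustrative values, not taken from the paper).
rng = np.random.default_rng(0)
n, p, n_treated, tau = 200, 5, 100, 2.0
X = rng.normal(size=(n, p))
y0 = X @ np.ones(p) + 0.5 * rng.normal(size=n)  # control potential outcomes
y1 = y0 + tau                                    # constant effect, so ATE = tau

# Design-based randomness: only the treatment assignment is redrawn.
dim_draws, adj_draws = [], []
for _ in range(2000):
    z = np.zeros(n, dtype=int)
    z[rng.choice(n, size=n_treated, replace=False)] = 1
    y_obs = np.where(z == 1, y1, y0)
    dim_draws.append(difference_in_means(y_obs, z))
    adj_draws.append(regression_adjusted_ate(y_obs, z, X))

dim_draws, adj_draws = np.array(dim_draws), np.array(adj_draws)
print("difference in means: mean %.3f, variance %.5f" % (dim_draws.mean(), dim_draws.var()))
print("regression adjusted: mean %.3f, variance %.5f" % (adj_draws.mean(), adj_draws.var()))
print("largest leverage: %.3f" % leverages(X).max())
```

Only the assignment vector is redrawn across iterations, so the spread of each estimator reflects design-based variability alone. In this sketch the covariates explain most of the outcome variation, which is the situation where adjustment is expected to help; the leverages summarize the covariate geometry that the abstract identifies as governing concentration and bias.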