Assumption-lean weak limits and tests for two-stage adaptive experiments
By: Ziang Niu, Zhimei Ren
Potential Business Impact:
Makes conclusions drawn from adaptive online experiments more statistically reliable.
Adaptive experiments are becoming increasingly popular in real-world applications because data-driven sampling can improve in-sample welfare and efficiency. Despite their growing prevalence, however, the statistical foundations for valid inference in such settings remain underdeveloped. Focusing on two-stage adaptive experimental designs, we address this gap by deriving new weak convergence results for mean outcomes and their differences. In particular, our results apply to a broad class of estimators, the weighted inverse probability weighted (WIPW) estimators. In contrast to prior work, our results require significantly weaker assumptions and sharply characterize phase transitions in limiting behavior across different signal regimes. Through this common lens, our general results unify previously fragmented results under the two-stage setup. Because the limiting distributions may be non-normal, which complicates inference, we propose a computationally efficient and provably valid plug-in bootstrap method for hypothesis testing. Our results and approaches are sufficiently general to accommodate various adaptive experimental designs, including batched bandit and subgroup enrichment experiments. Simulations and semi-synthetic studies demonstrate the practical value of our approach and reveal statistical phenomena unique to adaptive experiments.
Similar Papers
Simulation-Based Inference for Adaptive Experiments
Methodology
Finds best treatments faster, helps more people.
Design Stability in Adaptive Experiments: Implications for Treatment Effect Estimation
Statistics Theory
Helps experiments learn faster by changing rules.
Statistical Inference for Misspecified Contextual Bandits
Statistics Theory
Makes smart computer tests reliable and fair.