A Two-armed Bandit Framework for A/B Testing
By: Jinjuan Wang, Qianglin Wen, Yu Zhang and more
Potential Business Impact:
Tests new ideas faster and more reliably.
A/B testing is widely used in modern technology companies for policy evaluation and product deployment, with the goal of comparing the outcomes under a newly developed policy against a standard control. Various causal inference and reinforcement learning methods developed in the literature are applicable to A/B testing. This paper introduces a two-armed bandit framework designed to improve the power of existing approaches. The proposed procedure consists of three main steps: (i) employing doubly robust estimation to generate pseudo-outcomes, (ii) utilizing a two-armed bandit framework to construct the test statistic, and (iii) applying a permutation-based method to compute the $p$-value. We demonstrate the efficacy of the proposed method through asymptotic theory, numerical experiments, and real-world data from a ridesharing company, showing its superior performance in comparison to existing methods.
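The three steps in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a randomized experiment with a known treatment probability of 0.5, uses per-arm sample means as the outcome models, and substitutes a plain standardized mean for the bandit-based test statistic, whose construction the abstract does not specify. The function names `pseudo_outcomes`, `standardized_mean`, and `permutation_p_value` are hypothetical.

```python
import numpy as np

def pseudo_outcomes(y, a, e=0.5):
    """Step (i): doubly robust (AIPW) pseudo-outcomes.

    Assumption: outcome models are the per-arm sample means and the
    propensity e is a known constant (randomized experiment).
    """
    mu1 = y[a == 1].mean()
    mu0 = y[a == 0].mean()
    return mu1 - mu0 + a * (y - mu1) / e - (1 - a) * (y - mu0) / (1 - e)

def standardized_mean(psi):
    """Step (ii) stand-in: a standardized mean of the pseudo-outcomes.

    The paper builds its statistic from a two-armed bandit framework;
    this placeholder only marks where that statistic would plug in.
    """
    return np.sqrt(len(psi)) * psi.mean() / psi.std(ddof=1)

def permutation_p_value(y, a, statistic=standardized_mean, n_perm=999, seed=0):
    """Step (iii): one-sided p-value from permuting treatment labels."""
    rng = np.random.default_rng(seed)
    observed = statistic(pseudo_outcomes(y, a))
    null = np.empty(n_perm)
    for b in range(n_perm):
        a_perm = rng.permutation(a)  # shuffle labels, keep arm sizes fixed
        null[b] = statistic(pseudo_outcomes(y, a_perm))
    # The +1 terms count the observed statistic among the permutations,
    # so the p-value is never exactly zero.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)
```

Calling `permutation_p_value(y, a)` with a NumPy outcome array `y` and a 0/1 treatment array `a` returns a one-sided $p$-value for the null of no treatment effect. Any other statistic computed from the pseudo-outcomes can be passed through the `statistic` argument without changing the permutation step.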
Similar Papers
Beyond ATE: Multi-Criteria Design for A/B Testing
Methodology
Tests help make more money and keep data private.
Beyond Basic A/B testing: Improving Statistical Efficiency for Business Growth
Methodology
Improves website tests for better business results.
Beyond Normality: Reliable A/B Testing with Non-Gaussian Data
Machine Learning (Stat)
Fixes online tests to make better choices.