The Honest Truth About Causal Trees: Accuracy Limits for Heterogeneous Treatment Effect Estimation
By: Matias D. Cattaneo, Jason M. Klusowski, Ruiqi Rae Yu
Potential Business Impact:
Shows where widely used causal tree methods can give inaccurate estimates of which treatments work better for whom.
Recursive decision trees have emerged as a leading methodology for heterogeneous causal treatment effect estimation and inference in experimental and observational settings. These procedures are fitted using the celebrated CART (Classification And Regression Tree) algorithm [Breiman et al., 1984], or custom variants thereof, and hence are believed to be "adaptive" to high-dimensional data, sparsity, or other specific features of the underlying data generating process. Athey and Imbens [2016] proposed several "honest" causal decision tree estimators, which have become the standard in both academia and industry. We study their estimators, and variants thereof, and establish lower bounds on their estimation error. We demonstrate that these popular heterogeneous treatment effect estimators cannot achieve a polynomial-in-$n$ convergence rate under basic conditions, where $n$ denotes the sample size. Contrary to common belief, honesty does not resolve these limitations and at best delivers negligible logarithmic improvements in sample size or dimension. As a result, these commonly used estimators can exhibit poor performance in practice, and even be inconsistent in some settings. Our theoretical insights are empirically validated through simulations.
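To make the "honest" construction concrete, below is a minimal Python sketch of honest treatment-effect estimation via sample splitting: a CART partition is grown on one half of the data, and leaf-level difference-in-means effects are estimated on the held-out half. This is an illustrative stand-in, not the Athey-Imbens estimator studied in the paper; scikit-learn's DecisionTreeRegressor and all simulation settings are assumptions chosen for demonstration.

```python
# Minimal sketch of "honest" treatment-effect estimation via sample splitting.
# NOT the Athey-Imbens estimator: an off-the-shelf CART fit to the outcome
# grows the partition on one half of the data, and the held-out half supplies
# difference-in-means effect estimates inside each leaf.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Simulated randomized experiment (assumed setup): outcome depends on X[:, 0]
# and on a binary treatment W with a heterogeneous effect.
n, d = 2000, 5
X = rng.uniform(-1, 1, size=(n, d))
W = rng.integers(0, 2, size=n)                    # random treatment assignment
tau = 1.0 * (X[:, 0] > 0)                         # true heterogeneous effect
Y = X[:, 0] + tau * W + rng.normal(scale=0.5, size=n)

# Honesty: the partition and the leaf estimates come from disjoint observations.
half = n // 2
X_tr, Y_tr = X[:half], Y[:half]                   # used only to grow the tree
X_est, W_est, Y_est = X[half:], W[half:], Y[half:]  # used only to estimate effects

tree = DecisionTreeRegressor(max_leaf_nodes=8, min_samples_leaf=50)
tree.fit(X_tr, Y_tr)                              # CART partition of the covariate space

# Within each leaf, estimate the effect as treated-minus-control mean on the held-out half.
leaf_ids = tree.apply(X_est)
leaf_effects = {}
for leaf in np.unique(leaf_ids):
    in_leaf = leaf_ids == leaf
    treated = in_leaf & (W_est == 1)
    control = in_leaf & (W_est == 0)
    if treated.any() and control.any():
        leaf_effects[leaf] = Y_est[treated].mean() - Y_est[control].mean()

# The predicted effect for a new point is the estimate of the leaf it falls into.
X_new = np.array([[0.5, 0, 0, 0, 0], [-0.5, 0, 0, 0, 0]])
print([leaf_effects.get(leaf, np.nan) for leaf in tree.apply(X_new)])
```

The sample split is what distinguishes the honest variant from an "adaptive" tree that reuses the same observations for both splitting and estimation; the paper's lower bounds show that this split alone does not buy a polynomial-in-$n$ convergence rate.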
Similar Papers
Honesty in Causal Forests: When It Helps and When It Hurts
Machine Learning (CS)
Shows when honest splitting helps or hurts causal forest predictions of treatment effects.
Staged Event Trees for Transparent Treatment Effect Estimation
Methodology
Uses staged event trees to estimate treatment effects in a transparent way.
Reliable Selection of Heterogeneous Treatment Effect Estimators
Machine Learning (Stat)
Helps pick reliable methods for estimating how treatment effects differ across people.