Finite-Sample Failures and Condition-Number Diagnostics in Double Machine Learning
By: Gabriel Saco
Standard Double Machine Learning (DML; Chernozhukov et al., 2018) confidence intervals can exhibit substantial finite-sample coverage distortions when the underlying score equations are ill-conditioned, even if nuisance functions are estimated with state-of-the-art methods. Focusing on the partially linear regression (PLR) model, we show that a simple, easily computed condition number for the orthogonal score, denoted kappa_DML := 1 / |J_theta|, where J_theta is the Jacobian of the score with respect to the target parameter, largely determines when DML inference is reliable. Our first result is a nonasymptotic, Berry-Esseen-type bound showing that the coverage error of the usual DML t-statistic is of order n^{-1/2} + sqrt(n) * r_n, where r_n is the standard DML remainder term summarizing nuisance estimation error. Our second result is a refined linearization in which both estimation error and confidence interval length scale as kappa_DML / sqrt(n) + kappa_DML * r_n, so that ill-conditioning directly inflates both variance and bias. These expansions yield three conditioning regimes (well-conditioned, moderately ill-conditioned, and severely ill-conditioned) and imply that informative, shrinking confidence sets require kappa_DML = o_p(sqrt(n)) and kappa_DML * r_n -> 0. We conduct Monte Carlo experiments across overlap levels, nuisance learners (OLS, Lasso, random forests), and both low- and high-dimensional (p > n) designs. Across these designs, kappa_DML is highly predictive of finite-sample performance: well-conditioned designs with kappa_DML < 1 deliver near-nominal coverage with short intervals, whereas severely ill-conditioned designs can exhibit large bias and coverage around 40% for nominal 95% intervals, despite flexible nuisance fitting. We propose reporting kappa_DML alongside DML estimates as a routine diagnostic of score conditioning, in direct analogy to condition-number checks and weak-instrument diagnostics in IV settings.
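To illustrate how the diagnostic could be reported in practice, the sketch below implements cross-fitted DML for the PLR model and computes kappa_DML alongside the point estimate. It assumes the standard PLR orthogonal score psi = (Y - l(X) - theta*(D - m(X))) * (D - m(X)), for which J_theta = -E[(D - m(X))^2], so kappa_DML is one over the mean squared treatment residual; the function name dml_plr_with_kappa, the random-forest nuisance learners, and the five-fold split are illustrative choices, not the paper's implementation, and the paper's exact construction of kappa_DML may differ in details.

# Minimal sketch (not the authors' code): cross-fitted DML for the partially
# linear model Y = theta*D + g(X) + eps, reporting kappa_DML = 1 / |J_theta|.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_plr_with_kappa(y, d, X, n_folds=5, seed=0):
    """Cross-fitted PLR estimate of theta plus the kappa_DML diagnostic."""
    n = len(y)
    res_y = np.zeros(n)   # Y - l_hat(X), residualized outcome
    res_d = np.zeros(n)   # D - m_hat(X), residualized treatment
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        # Nuisance functions fit on the training folds only (cross-fitting).
        l_hat = RandomForestRegressor(random_state=seed).fit(X[train], y[train])
        m_hat = RandomForestRegressor(random_state=seed).fit(X[train], d[train])
        res_y[test] = y[test] - l_hat.predict(X[test])
        res_d[test] = d[test] - m_hat.predict(X[test])

    J_theta = -np.mean(res_d ** 2)        # score Jacobian for the PLR model
    kappa_dml = 1.0 / abs(J_theta)        # condition-number diagnostic
    theta_hat = np.sum(res_d * res_y) / np.sum(res_d ** 2)

    # Plug-in sandwich standard error based on the orthogonal score.
    psi = (res_y - theta_hat * res_d) * res_d
    se = np.sqrt(np.mean(psi ** 2) / J_theta ** 2 / n)
    return theta_hat, se, kappa_dml

# Usage: report kappa_DML next to the point estimate and 95% interval, e.g.
# theta_hat, se, kappa = dml_plr_with_kappa(y, d, X)
# print(f"theta = {theta_hat:.3f} +/- {1.96 * se:.3f}, kappa_DML = {kappa:.2f}")

A large kappa_DML flags weak residual variation in the treatment after partialling out X (poor overlap), which is exactly the regime where the abstract reports inflated bias, long intervals, and coverage far below nominal.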