Dynamic Synthetic Controls vs. Panel-Aware Double Machine Learning for Geo-Level Marketing Impact Estimation
By: Sang Su Lee, Vineeth Loganathan, Vijay Raghavan
Potential Business Impact:
Helps businesses know if ads really work.
Accurately quantifying geo-level marketing lift in two-sided marketplaces is challenging: the Synthetic Control Method (SCM) often exhibits high power yet systematically underestimates effect size, while panel-style Double Machine Learning (DML) is seldom benchmarked against SCM. We build an open, fully documented simulator that mimics a typical large-scale geo roll-out: N_unit regional markets are tracked for T_pre weeks before launch and for a further T_post weeks during the campaign window, and the user can vary all key parameters. We use it to probe both estimator families under five stylized stress tests: 1) curved baseline trends, 2) heterogeneous response lags, 3) treated-biased shocks, 4) a non-linear outcome link, and 5) a drifting control-group trend. Seven estimators are evaluated: three standard Augmented SCM (ASC) variants and four panel-DML flavors (TWFE, CRE/Mundlak, first-difference, and within-group). Across 100 replications per scenario, the ASC models consistently show severe bias and near-zero coverage in the challenging scenarios that involve nonlinearities or external shocks. By contrast, the panel-DML variants dramatically reduce this bias and restore nominal 95%-CI coverage, proving far more robust. The results indicate that while ASC provides a simple baseline, it is unreliable in common, complex situations. We therefore propose a 'diagnose-first' framework in which practitioners first identify the primary business challenge (e.g., nonlinear trends, response lags) and then select the specific DML model best suited to that scenario, providing a more robust and reliable blueprint for analyzing geo-experiments.
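The evaluation loop described above (simulate a geo panel, estimate the lift, then summarize bias and 95%-CI coverage over replications) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' simulator: the function names, the default parameter values, and the data-generating process (unit effects plus a shared, optionally curved trend plus Gaussian noise) are all hypothetical, and a plain difference-in-differences estimator stands in for the ASC and panel-DML estimators benchmarked in the paper.

```python
import random
import statistics


def simulate_panel(n_unit=40, n_treated=10, t_pre=52, t_post=12,
                   tau=2.0, curvature=0.0, noise_sd=1.0, rng=None):
    """Hypothetical DGP: unit effect + shared (optionally curved) trend + noise.

    The first n_treated units receive a constant lift tau in the post period.
    """
    if rng is None:
        rng = random.Random()
    t_total = t_pre + t_post
    trend = [0.05 * t + curvature * (t / t_total) ** 2 for t in range(t_total)]
    panel = []
    for i in range(n_unit):
        unit_fx = rng.gauss(10.0, 2.0)
        lift = tau if i < n_treated else 0.0
        panel.append([
            unit_fx + trend[t] + rng.gauss(0.0, noise_sd)
            + (lift if t >= t_pre else 0.0)
            for t in range(t_total)
        ])
    treated = [i < n_treated for i in range(n_unit)]
    return panel, treated


def did_estimate(panel, treated, t_pre):
    """Difference-in-differences on per-unit (post mean - pre mean) changes.

    A stand-in estimator, NOT one of the paper's ASC or panel-DML models.
    """
    change = [statistics.fmean(row[t_pre:]) - statistics.fmean(row[:t_pre])
              for row in panel]
    d_t = [c for c, is_t in zip(change, treated) if is_t]
    d_c = [c for c, is_t in zip(change, treated) if not is_t]
    est = statistics.fmean(d_t) - statistics.fmean(d_c)
    se = (statistics.variance(d_t) / len(d_t)
          + statistics.variance(d_c) / len(d_c)) ** 0.5
    return est, se


def evaluate(n_rep=100, tau=2.0, seed=0, t_pre=52, **sim_kwargs):
    """Bias and normal-approximation 95%-CI coverage across replications."""
    rng = random.Random(seed)
    estimates, covered = [], []
    for _ in range(n_rep):
        panel, treated = simulate_panel(t_pre=t_pre, tau=tau, rng=rng,
                                        **sim_kwargs)
        est, se = did_estimate(panel, treated, t_pre)
        estimates.append(est)
        covered.append(abs(est - tau) <= 1.96 * se)
    return {"bias": statistics.fmean(estimates) - tau,
            "coverage": sum(covered) / n_rep}


if __name__ == "__main__":
    print(evaluate(n_rep=100, curvature=3.0))
```

In this toy setup the curved trend is shared by treated and control units, so it cancels in the difference and the stand-in estimator stays roughly unbiased. The stress tests listed in the abstract would instead require modifying the data-generating process, for example by adding shocks that hit only treated units or a trend that drifts only in the control group.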
Similar Papers
Time-Aware Synthetic Control
Machine Learning (CS)
Helps predict future events better with time patterns.
Efficiently Learning Synthetic Control Models for High-dimensional Disaggregated Data
Methodology
Finds what caused job losses during lockdowns.
Double Debiased Machine Learning for Mediation Analysis with Continuous Treatments
Machine Learning (Stat)
Finds how one thing truly causes another.