Statistical Inference for Conditional Group Distributionally Robust Optimization with Cross-Entropy Loss
By: Zijian Guo, Zhenyu Wang, Yifan Hu, and more
Potential Business Impact:
Helps computers learn from many different data sources so predictions stay reliable in new settings.
In multi-source learning with discrete labels, distributional heterogeneity across domains poses a central challenge to developing predictive models that transfer reliably to unseen domains. We study multi-source unsupervised domain adaptation, where labeled data are drawn from multiple source domains and only unlabeled data are available from the target domain. To address potential distribution shifts, we propose a novel Conditional Group Distributionally Robust Optimization (CG-DRO) framework that learns a classifier by minimizing the worst-case cross-entropy loss over convex combinations of the conditional outcome distributions from the sources. To solve the resulting minimax problem, we develop an efficient Mirror Prox algorithm in which a double machine learning procedure is employed to estimate the risk function. This ensures that the errors of the machine learning estimators for the nuisance models enter only at higher-order rates, thereby preserving statistical efficiency under covariate shift. We establish fast statistical convergence rates for the estimator by constructing two surrogate minimax optimization problems that serve as theoretical bridges. A distinguishing challenge for CG-DRO is the emergence of nonstandard asymptotics: the empirical estimator may fail to converge to a standard limiting distribution due to boundary effects and system instability. To address this, we introduce a perturbation-based inference procedure that enables uniformly valid inference, including confidence interval construction and hypothesis testing.
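To make the objective concrete, the following is a plausible formalization of the CG-DRO criterion the abstract describes; the notation (source conditionals p^(l)(y | X), target covariate law Q_X, simplex Delta^L) is our assumption, not taken from the paper:

```latex
% Hedged sketch of the CG-DRO objective (notation assumed): the classifier
% f_theta minimizes the worst-case cross-entropy loss over convex
% combinations q of the source conditional outcome distributions.
\hat{\theta} = \arg\min_{\theta} \; \max_{q \in \Delta^{L}}
  \; \mathbb{E}_{X \sim \mathbb{Q}_X}\!\left[
    \ell_{\mathrm{CE}}\!\Bigl(\sum_{\ell=1}^{L} q_{\ell}\, p^{(\ell)}(\cdot \mid X),\; f_{\theta}(X)\Bigr)
  \right],
\qquad
\ell_{\mathrm{CE}}(p, \hat{p}) = -\sum_{y} p(y) \log \hat{p}(y).
```

Since cross-entropy is linear in its target argument, this inner maximum over mixtures equals the maximum of the per-source risks, so the problem has the familiar group-minimax structure. Below is a minimal Mirror Prox sketch for that structure, with a Euclidean step on the classifier parameters and an entropic (exponentiated-gradient) step on the mixture weights; the toy data, plug-in source conditionals, and step sizes are illustrative assumptions, and the paper's double machine learning risk estimation and perturbation-based inference are not reproduced here:

```python
# Minimal Mirror Prox sketch for a group minimax problem of the form
#   min_theta max_{q in simplex} sum_l q_l * R_l(theta),
# mirroring the CG-DRO structure described in the abstract. All names
# (R_l as cross-entropy against source pseudo-labels, step sizes, the
# toy data) are illustrative assumptions, not the paper's estimator.
import numpy as np

rng = np.random.default_rng(0)
n, d, K, L = 500, 5, 3, 2          # target samples, features, classes, sources

X = rng.normal(size=(n, d))        # unlabeled target covariates
# Plug-in conditional models from each source: here fixed random softmax
# heads stand in for fitted nuisance estimates of p^(l)(y | x).
W_src = [rng.normal(size=(d, K)) for _ in range(L)]

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

P_src = [softmax(X @ W) for W in W_src]   # p^(l)(y | x) on target covariates

def group_losses(theta):
    """R_l(theta): mean cross-entropy of f_theta against each source's labels."""
    logp = np.log(softmax(X @ theta) + 1e-12)
    return np.array([-(P * logp).sum(axis=1).mean() for P in P_src])

def grad_theta(theta, q):
    """Gradient of sum_l q_l R_l(theta) for the softmax classifier."""
    P_hat = softmax(X @ theta)
    P_mix = sum(ql * P for ql, P in zip(q, P_src))
    return X.T @ (P_hat - P_mix) / n

theta = np.zeros((d, K))
q = np.full(L, 1.0 / L)
eta_th, eta_q = 0.5, 0.5           # step sizes (assumed, untuned)

for _ in range(200):
    # Extrapolation step: proximal moves using gradients at the current point.
    g_th, g_q = grad_theta(theta, q), group_losses(theta)
    th_half = theta - eta_th * g_th              # Euclidean prox (descent) on theta
    q_half = q * np.exp(eta_q * g_q)             # entropic prox (ascent) on q
    q_half /= q_half.sum()
    # Update step: gradients at the extrapolated point, applied from the start.
    g_th, g_q = grad_theta(th_half, q_half), group_losses(th_half)
    theta = theta - eta_th * g_th
    q = q * np.exp(eta_q * g_q)
    q /= q.sum()

print("worst-case group loss:", group_losses(theta).max())
print("adversarial weights q:", np.round(q, 3))
```

The two-stage update (gradients evaluated at an extrapolated point but applied from the starting point) is the defining extragradient structure of Mirror Prox, which is what gives it fast convergence on smooth saddle-point problems.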
Similar Papers
Group Distributionally Robust Machine Learning under Group Level Distributional Uncertainty
Machine Learning (CS)
Makes AI fair for everyone, even small groups.
Distributionally Robust Optimization with Adversarial Data Contamination
Machine Learning (CS)
Protects computer learning from bad data and changes.
Mitigating Spurious Correlation via Distributionally Robust Learning with Hierarchical Ambiguity Sets
Machine Learning (CS)
Makes AI work better when data changes.