Wasserstein-p Central Limit Theorem Rates: From Local Dependence to Markov Chains
By: Yixuan Zhang, Qiaomin Xie
Finite-time central limit theorem (CLT) rates play a central role in modern machine learning (ML). In this paper, we study CLT rates for multivariate dependent data in Wasserstein-$p$ ($\mathcal W_p$) distance, for general $p\ge 1$. We focus on two fundamental dependence structures that commonly arise in ML: locally dependent sequences and geometrically ergodic Markov chains. In both settings, we establish the \textit{first optimal} $\mathcal O(n^{-1/2})$ rate in $\mathcal W_1$, as well as the first $\mathcal W_p$ ($p\ge 2$) CLT rates under mild moment assumptions, substantially improving the best previously known bounds in these dependent-data regimes. As an application of our optimal $\mathcal W_1$ rate for locally dependent sequences, we further obtain the first optimal $\mathcal W_1$--CLT rate for multivariate $U$-statistics. On the technical side, we derive a tractable auxiliary bound for $\mathcal W_1$ Gaussian approximation errors that is well suited to studying dependent data. For Markov chains, we further prove that the regeneration time of the split chain associated with a geometrically ergodic chain has a geometric tail without assuming strong aperiodicity or other restrictive conditions. These tools may be of independent interests and enable our optimal $\mathcal W_1$ rates and underpin our $\mathcal W_p$ ($p\ge 2$) results.
Similar Papers
Nonasymptotic CLT and Error Bounds for Two-Time-Scale Stochastic Approximation
Machine Learning (CS)
Makes computer learning faster and more accurate.
Nonasymptotic CLT and Error Bounds for Two-Time-Scale Stochastic Approximation
Machine Learning (CS)
Improves computer learning speed and accuracy.
Fast Wasserstein rates for estimating probability distributions of probabilistic graphical models
Statistics Theory
Helps computers learn from less information.