Score: 0

Wasserstein-p Central Limit Theorem Rates: From Local Dependence to Markov Chains

Published: January 13, 2026 | arXiv ID: 2601.08184v1

By: Yixuan Zhang, Qiaomin Xie

Finite-time central limit theorem (CLT) rates play a central role in modern machine learning (ML). In this paper, we study CLT rates for multivariate dependent data in Wasserstein-$p$ ($\mathcal W_p$) distance, for general $p\ge 1$. We focus on two fundamental dependence structures that commonly arise in ML: locally dependent sequences and geometrically ergodic Markov chains. In both settings, we establish the \textit{first optimal} $\mathcal O(n^{-1/2})$ rate in $\mathcal W_1$, as well as the first $\mathcal W_p$ ($p\ge 2$) CLT rates under mild moment assumptions, substantially improving the best previously known bounds in these dependent-data regimes. As an application of our optimal $\mathcal W_1$ rate for locally dependent sequences, we further obtain the first optimal $\mathcal W_1$--CLT rate for multivariate $U$-statistics. On the technical side, we derive a tractable auxiliary bound for $\mathcal W_1$ Gaussian approximation errors that is well suited to studying dependent data. For Markov chains, we further prove that the regeneration time of the split chain associated with a geometrically ergodic chain has a geometric tail without assuming strong aperiodicity or other restrictive conditions. These tools may be of independent interests and enable our optimal $\mathcal W_1$ rates and underpin our $\mathcal W_p$ ($p\ge 2$) results.

Category
Mathematics:
Probability