Statistical Guarantees for High-Dimensional Stochastic Gradient Descent

Published: October 13, 2025 | arXiv ID: 2510.12013v1

By: Jiaqi Li, Zhipeng Lou, Johannes Schmidt-Hieber, and others

Potential Business Impact:

Provides rigorous statistical guarantees for SGD-based training in models with very many parameters, helping practitioners quantify uncertainty and trust the results of large-scale learning systems.

Business Areas:
A/B Testing, Data and Analytics

Stochastic Gradient Descent (SGD) and its Ruppert-Polyak averaged variant (ASGD) lie at the heart of modern large-scale learning, yet their theoretical properties in high-dimensional settings remain far less understood. In this paper, we provide rigorous statistical guarantees for constant learning-rate SGD and ASGD in high-dimensional regimes. Our key innovation is to transfer powerful tools from high-dimensional time series to online learning. Specifically, by viewing SGD as a nonlinear autoregressive process and adapting existing coupling techniques, we prove the geometric-moment contraction of high-dimensional SGD for constant learning rates, thereby establishing asymptotic stationarity of the iterates. Building on this, we derive the $q$-th moment convergence of SGD and ASGD for any $q\ge2$ in general $\ell^s$-norms, and, in particular, the $\ell^{\infty}$-norm that is frequently adopted in high-dimensional sparse or structured models. Furthermore, we provide a sharp high-probability concentration analysis, which yields probabilistic bounds for high-dimensional ASGD. Beyond closing a critical gap in SGD theory, our proposed framework offers a novel toolkit for analyzing a broad class of high-dimensional learning algorithms.
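To make the objects in the abstract concrete, here is a minimal sketch of constant learning-rate SGD with Ruppert-Polyak averaging on a synthetic least-squares problem. The data, step size, and function names are illustrative assumptions, not taken from the paper; the update in the loop is the recursion that the analysis views as a nonlinear autoregressive process.

```python
import numpy as np

def constant_lr_asgd(X, y, gamma, n_steps, seed=0):
    """Constant learning-rate SGD with Ruppert-Polyak averaging (ASGD)
    on a least-squares objective; illustrative sketch only."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)       # SGD iterate theta_k
    theta_bar = np.zeros(d)   # running Ruppert-Polyak average
    for k in range(1, n_steps + 1):
        i = rng.integers(n)                          # draw one observation xi_k
        grad = (X[i] @ theta - y[i]) * X[i]          # stochastic gradient g(theta_{k-1}, xi_k)
        theta = theta - gamma * grad                 # constant-step SGD recursion
        theta_bar += (theta - theta_bar) / k         # incremental average of the iterates
    return theta, theta_bar

# Toy high-dimensional example: d is a sizeable fraction of n.
rng = np.random.default_rng(1)
n, d = 2000, 500
theta_star = np.zeros(d)
theta_star[:10] = 1.0                                # sparse ground truth
X = rng.standard_normal((n, d))
y = X @ theta_star + 0.1 * rng.standard_normal(n)

theta_last, theta_avg = constant_lr_asgd(X, y, gamma=1e-3, n_steps=20000)
# ell-infinity errors; averaging typically damps the stationary fluctuation of the last iterate
print(np.max(np.abs(theta_last - theta_star)), np.max(np.abs(theta_avg - theta_star)))
```

With a constant step size, the last iterate does not converge exactly but settles into fluctuations around the minimizer; this is the asymptotic stationarity formalized by the paper's geometric-moment contraction result, and averaging the iterates is what underlies the sharper $\ell^{\infty}$ and high-probability guarantees stated in the abstract.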

Country of Origin
🇺🇸 🇳🇱 United States, Netherlands

Page Count
37 pages

Category
Statistics: Machine Learning (Stat)