Tight Analysis of Decentralized SGD: A Markov Chain Perspective

Published: January 11, 2026 | arXiv ID: 2601.07021v1

By: Lucas Versini, Paul Mangold, Aymeric Dieuleveut

Potential Business Impact:

Speeds up distributed machine-learning training: adding more clients yields a proportional (linear) speed-up, largely independent of how the clients are connected.

Business Areas:
Decentralized and Distributed Machine Learning

We propose a novel analysis of the Decentralized Stochastic Gradient Descent (DSGD) algorithm with constant step size, interpreting the algorithm's iterates as a Markov chain. We show that DSGD converges to a stationary distribution whose bias decomposes, to first order, into two components: one due to decentralization (growing with the graph's spectral gap and the clients' heterogeneity) and one due to stochasticity. Remarkably, the variance of the local parameters is, to first order, inversely proportional to the number of clients, regardless of the network topology and even when the clients' iterates are not averaged at the end. As a consequence of our analysis, we obtain non-asymptotic convergence bounds for the clients' local iterates, confirming that DSGD enjoys linear speed-up in the number of clients and that the network topology only affects higher-order terms.
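To make the setting concrete, below is a minimal simulation sketch of the DSGD recursion the abstract analyzes: at each step, every client mixes its parameters with its neighbors through a gossip matrix W and then takes a local stochastic gradient step. The quadratic objectives, Gaussian gradient noise, ring topology, and all parameter values (n_clients, gamma, sigma) are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of Decentralized SGD (DSGD) with a constant step size.
# Assumptions (not from the paper): quadratic local objectives
# f_i(x) = 0.5 * ||x - b_i||^2, Gaussian gradient noise, and a ring-topology
# gossip matrix W; n_clients, gamma, sigma are illustrative values.
import numpy as np

rng = np.random.default_rng(0)
n_clients, dim = 8, 2
gamma, sigma = 0.05, 0.1                 # constant step size, noise level
b = rng.normal(size=(n_clients, dim))    # heterogeneous local minimizers

# Doubly stochastic gossip matrix for a ring: each client averages
# its parameters with its two neighbors.
W = np.zeros((n_clients, n_clients))
for i in range(n_clients):
    W[i, i] = 0.5
    W[i, (i - 1) % n_clients] = 0.25
    W[i, (i + 1) % n_clients] = 0.25

x = np.zeros((n_clients, dim))           # one parameter vector per client
for _ in range(5000):
    noise = sigma * rng.normal(size=x.shape)
    grads = (x - b) + noise              # stochastic local gradients
    x = W @ x - gamma * grads            # gossip step, then local SGD step

# With a constant step size the iterates do not converge to a point: they
# fluctuate around a stationary distribution. Each client is pulled toward
# its own minimizer b_i, so heterogeneity spreads the clients around the
# network average.
x_bar = x.mean(axis=0)
print("network average:      ", x_bar)
print("global minimizer:     ", b.mean(axis=0))
print("client spread (norms):", np.linalg.norm(x - x_bar, axis=1))
```

Rerunning this sketch with more clients, or with a different gossip matrix, gives a rough empirical view of the claimed behavior: the fluctuation of the iterates shrinks as clients are added, while the topology mainly affects how far individual clients drift from the network average.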

Page Count
35 pages

Category
Computer Science:
Machine Learning (CS)