Optimal Complexity in Byzantine-Robust Distributed Stochastic Optimization with Data Heterogeneity
By: Qiankun Shi, Jie Peng, Kun Yuan, and others
Potential Business Impact:
Keeps distributed machine learning working reliably even when some machines send corrupted or malicious updates.
In this paper, we establish tight lower bounds for Byzantine-robust distributed first-order stochastic optimization methods in both the strongly convex and non-convex settings. We show that when the distributed nodes hold heterogeneous data, the convergence error comprises two components: a non-vanishing Byzantine error and a vanishing optimization error. We derive lower bounds on the Byzantine error and on the minimum number of queries to a stochastic gradient oracle required to make the optimization error arbitrarily small. However, these lower bounds exhibit significant gaps relative to the existing upper bounds. To close these gaps, we leverage Nesterov's acceleration and variance reduction to develop novel Byzantine-robust distributed stochastic optimization methods that provably match the lower bounds up to logarithmic factors, implying that the established lower bounds are tight.
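As a rough intuition for the setting (not the paper's proposed method): in Byzantine-robust distributed SGD, a server aggregates stochastic gradients from many nodes, some of which may be adversarial, through a robust aggregation rule. With heterogeneous data, robust aggregation leaves a residual bias, which corresponds to the non-vanishing Byzantine error, while the remaining optimization error shrinks as more gradient queries are made. Below is a minimal Python sketch of one such round using a coordinate-wise trimmed mean. The aggregator, step size, trimming fraction, and toy quadratic objectives are illustrative assumptions only; the paper's methods additionally employ Nesterov's acceleration and variance reduction, which are not shown here.

```python
import numpy as np

# Minimal illustrative sketch of one round of Byzantine-robust distributed SGD.
# All names and parameters (trimmed_mean, eta, trim_fraction) are assumptions
# for illustration, not the accelerated variance-reduced methods of the paper.

def trimmed_mean(grads, trim_fraction):
    """Coordinate-wise trimmed mean: per coordinate, drop the largest and
    smallest trim_fraction of reported values, then average the rest."""
    grads = np.sort(np.asarray(grads), axis=0)      # sort each coordinate independently
    k = int(trim_fraction * grads.shape[0])          # number trimmed per side
    kept = grads[k:grads.shape[0] - k] if k > 0 else grads
    return kept.mean(axis=0)

def robust_sgd_step(x, local_grad_fns, eta=0.1, trim_fraction=0.2):
    """One synchronous round: every node reports a stochastic gradient at x
    (Byzantine nodes may report arbitrary vectors); the server aggregates
    robustly and takes a gradient step."""
    reported = [g(x) for g in local_grad_fns]        # honest or Byzantine reports
    agg = trimmed_mean(reported, trim_fraction)
    return x - eta * agg

# Toy usage: 8 honest nodes with heterogeneous quadratic objectives, 2 Byzantine nodes.
rng = np.random.default_rng(0)
targets = rng.normal(size=(8, 5))                    # heterogeneous local optima
honest = [lambda x, t=t: (x - t) + 0.01 * rng.normal(size=5) for t in targets]
byzantine = [lambda x: 1e3 * rng.normal(size=5) for _ in range(2)]

x = np.zeros(5)
for _ in range(200):
    x = robust_sgd_step(x, honest + byzantine)
# x ends up near the mean of `targets`, up to a residual bias that does not
# vanish with more iterations -- the Byzantine-error component of the bound.
print(x)
```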
Similar Papers
Distributed Stochastic Zeroth-Order Optimization with Compressed Communication
Optimization and Control
Helps computers learn together while sending less data between them.
Nonconvex Decentralized Stochastic Bilevel Optimization under Heavy-Tailed Noises
Machine Learning (CS)
Helps networks of computers keep learning even when their data is extremely noisy.
Generalization Error Analysis for Attack-Free and Byzantine-Resilient Decentralized Learning with Data Heterogeneity
Machine Learning (CS)
Studies how well computers that learn together can generalize when their data differs and some participants misbehave.