Score: 1

Widening the Network Mitigates the Impact of Data Heterogeneity on FedAvg

Published: August 18, 2025 | arXiv ID: 2508.12576v1

By: Like Jian, Dong Liu

Potential Business Impact:

Enables organizations to train a shared AI model on distributed private data without that data ever leaving each client.

Federated learning (FL) enables decentralized clients to train a model collaboratively without sharing local data. A key distinction between FL and centralized learning is that clients' data are not independent and identically distributed (non-IID), which poses significant challenges in training a global model that generalizes well across heterogeneous local data distributions. In this paper, we analyze the convergence of overparameterized FedAvg with gradient descent (GD). We prove that the impact of data heterogeneity diminishes as the width of neural networks increases, ultimately vanishing when the width approaches infinity. In the infinite-width regime, we further prove that both the global and local models in FedAvg behave as linear models, and that FedAvg achieves the same generalization performance as centralized learning with the same number of GD iterations. Extensive experiments validate our theoretical findings across various network architectures, loss functions, and optimization methods.
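The setting the abstract describes, several clients running full-batch gradient descent on an overparameterized network followed by server-side weight averaging, can be illustrated with a short sketch. The code below is a hypothetical, minimal FedAvg loop on a two-layer ReLU network with an NTK-style parameterization (fixed output layer, trained hidden layer); the synthetic non-IID clients, width sweep, learning rate, and round counts are illustrative assumptions, not the paper's experimental setup or the authors' code.

```python
# Minimal FedAvg-with-GD sketch (hypothetical; for illustration only).
# Each client runs full-batch gradient descent on a width-`width` two-layer
# ReLU network, then the server averages the hidden-layer weights.
# Increasing `width` is the regime the paper argues shrinks the gap
# between FedAvg and centralized training under data heterogeneity.
import numpy as np

rng = np.random.default_rng(0)

def init_params(d_in, width):
    # NTK-style scaling: random fixed output layer, trainable hidden layer.
    return {"W": rng.normal(size=(width, d_in)) / np.sqrt(d_in),
            "a": rng.choice([-1.0, 1.0], size=width) / np.sqrt(width)}

def forward(params, X):
    H = np.maximum(params["W"] @ X.T, 0.0)        # ReLU features, (width, n)
    return params["a"] @ H                        # predictions, (n,)

def grad_W(params, X, y):
    act = (params["W"] @ X.T > 0).astype(float)   # ReLU derivative
    resid = forward(params, X) - y                # prediction error, (n,)
    # Gradient of 0.5 * mean squared error w.r.t. the hidden weights.
    return (params["a"][:, None] * act * resid[None, :]) @ X / len(y)

def local_gd(params, X, y, lr, steps):
    p = {"W": params["W"].copy(), "a": params["a"]}
    for _ in range(steps):
        p["W"] -= lr * grad_W(p, X, y)
    return p

def fedavg(client_data, width, rounds=50, local_steps=5, lr=0.2):
    d_in = client_data[0][0].shape[1]
    global_p = init_params(d_in, width)
    for _ in range(rounds):
        locals_ = [local_gd(global_p, X, y, lr, local_steps)
                   for X, y in client_data]
        # Server step: average the locally trained hidden-layer weights.
        global_p["W"] = np.mean([p["W"] for p in locals_], axis=0)
    return global_p

# Two clients with deliberately heterogeneous input distributions (non-IID),
# but a shared underlying target function.
def make_client(shift, n=100, d_in=5):
    X = rng.normal(loc=shift, size=(n, d_in))
    y = np.sin(X @ np.ones(d_in))
    return X, y

clients = [make_client(-1.0), make_client(+1.0)]
for width in (16, 256, 4096):
    p = fedavg(clients, width)
    mse = np.mean([(forward(p, X) - y) ** 2 for X, y in clients])
    print(f"width={width:5d}  train MSE={mse:.4f}")
```

Run as a plain script, the width sweep gives a rough sense of the claimed trend: as the hidden layer gets wider, the averaged global model's fit across the heterogeneous clients moves closer to what centralized GD on the pooled data would achieve.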

Country of Origin
🇨🇳 China

Repos / Data Links

Page Count
27 pages

Category
Computer Science: Machine Learning (CS)