Widening the Network Mitigates the Impact of Data Heterogeneity on FedAvg
By: Like Jian, Dong Liu
Potential Business Impact:
Lets AI models learn from many people's private data without that data ever being shared.
Federated learning (FL) enables decentralized clients to train a model collaboratively without sharing local data. A key distinction between FL and centralized learning is that clients' data are not independent and identically distributed (non-IID), which poses significant challenges in training a global model that generalizes well across heterogeneous local data distributions. In this paper, we analyze the convergence of overparameterized FedAvg with gradient descent (GD). We prove that the impact of data heterogeneity diminishes as the width of neural networks increases, ultimately vanishing when the width approaches infinity. In the infinite-width regime, we further prove that both the global and local models in FedAvg behave as linear models, and that FedAvg achieves the same generalization performance as centralized learning with the same number of GD iterations. Extensive experiments validate our theoretical findings across various network architectures, loss functions, and optimization methods.
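To make the setting concrete, below is a minimal sketch of FedAvg with full-batch local GD on a one-hidden-layer network, where each client holds data drawn from a shifted (non-IID) distribution and the hidden-layer `width` is the quantity the paper studies. All names and hyperparameters (`num_clients`, `local_steps`, `width`, the synthetic data, equal averaging weights) are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal FedAvg sketch: local full-batch GD on heterogeneous clients,
# followed by server-side averaging of the local models.
import numpy as np

rng = np.random.default_rng(0)

num_clients, n_per_client, dim, width = 4, 32, 8, 256   # illustrative values
lr, rounds, local_steps = 0.1, 50, 5

# Heterogeneous (non-IID) local data: each client's inputs come from a shifted distribution.
clients = []
for c in range(num_clients):
    X = rng.normal(loc=0.5 * c, size=(n_per_client, dim))
    y = np.sin(X.sum(axis=1, keepdims=True))
    clients.append((X, y))

# One-hidden-layer ReLU network; increasing `width` is the widening regime studied here.
def init_params():
    return {
        "W1": rng.normal(size=(dim, width)) / np.sqrt(dim),
        "W2": rng.normal(size=(width, 1)) / np.sqrt(width),
    }

def forward(params, X):
    return np.maximum(X @ params["W1"], 0.0) @ params["W2"]

def grads(params, X, y):
    H = np.maximum(X @ params["W1"], 0.0)        # hidden activations
    err = (H @ params["W2"] - y) / len(X)        # gradient of 0.5 * MSE w.r.t. outputs
    dW2 = H.T @ err
    dH = err @ params["W2"].T * (H > 0)          # backprop through ReLU
    dW1 = X.T @ dH
    return {"W1": dW1, "W2": dW2}

global_params = init_params()
for r in range(rounds):
    local_params = []
    for X, y in clients:
        p = {k: v.copy() for k, v in global_params.items()}   # start from global model
        for _ in range(local_steps):                           # local full-batch GD
            g = grads(p, X, y)
            p = {k: p[k] - lr * g[k] for k in p}
        local_params.append(p)
    # Server step: average the local models (equal client weights assumed).
    global_params = {
        k: np.mean([p[k] for p in local_params], axis=0) for k in global_params
    }

mse = np.mean([np.mean((forward(global_params, X) - y) ** 2) for X, y in clients])
print(f"after {rounds} rounds: average training MSE = {mse:.4f}")
```

Re-running this sketch with a larger `width` (e.g., 4096) illustrates, informally, the paper's claim: as the network widens, the averaged global model's behavior becomes less sensitive to the heterogeneity across clients.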
Similar Papers
Local Performance vs. Out-of-Distribution Generalization: An Empirical Analysis of Personalized Federated Learning in Heterogeneous Data Environments
Machine Learning (CS)
Studies the trade-off between fitting each user's own data and handling data the AI has never seen before.
FedQuad: Federated Stochastic Quadruplet Learning to Mitigate Data Heterogeneity
Machine Learning (CS)
Helps AI trained across many different computers cope with their very different data.
Federated Learning in the Wild: A Comparative Study for Cybersecurity under Non-IID and Unbalanced Settings
Cryptography and Security
Helps computers find online attacks without sharing private data.