Federated Stochastic Minimax Optimization under Heavy-Tailed Noises
By: Xinwen Zhang, Hongchang Gao
Potential Business Impact:
Helps computers learn better with messy data.
Heavy-tailed noise has attracted growing attention in nonconvex stochastic optimization, as numerous empirical studies suggest it offers a more realistic assumption than the standard bounded-variance assumption. In this work, we investigate nonconvex-PL minimax optimization under heavy-tailed gradient noise in federated learning. We propose two novel algorithms: Fed-NSGDA-M, which integrates normalized gradients, and FedMuon-DA, which leverages the Muon optimizer for local updates. Both algorithms are designed to effectively address heavy-tailed noise in federated minimax optimization under a milder condition. We theoretically establish that both algorithms achieve a convergence rate of $O(1/(TNp)^{\frac{s-1}{2s}})$. To the best of our knowledge, these are the first federated minimax optimization algorithms with rigorous theoretical guarantees under heavy-tailed noise. Extensive experiments further validate their effectiveness.
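To illustrate the general idea the abstract points to, below is a minimal sketch of normalized stochastic gradient descent-ascent with momentum on a toy minimax problem. This is not the authors' Fed-NSGDA-M algorithm (which involves local updates across N clients and periodic averaging); the toy objective, Student-t noise model, step sizes, and all variable names are assumptions made purely for exposition of why normalization helps under heavy-tailed noise.

```python
# Minimal sketch: normalized SGDA with momentum on a toy minimax problem,
# under heavy-tailed (Student-t) gradient noise. Illustrative only; this is
# NOT the paper's Fed-NSGDA-M, and all constants here are assumed values.
import numpy as np

rng = np.random.default_rng(0)

# Toy saddle objective: f(x, y) = 0.5*||x||^2 + x.y - 0.5*||y||^2
def grad_x(x, y):
    return x + y          # gradient of f w.r.t. x (descent variable)

def grad_y(x, y):
    return x - y          # gradient of f w.r.t. y (ascent variable)

def noisy(g, scale=0.5):
    # Heavy-tailed noise with infinite variance-like behavior (df < 3).
    return g + scale * rng.standard_t(df=2.5, size=g.shape)

d = 10
x, y = rng.normal(size=d), rng.normal(size=d)
mx, my = np.zeros(d), np.zeros(d)
beta, eta0 = 0.9, 0.5      # momentum coefficient and base step size (assumed)

for t in range(2000):
    gx, gy = noisy(grad_x(x, y)), noisy(grad_y(x, y))
    mx = beta * mx + (1.0 - beta) * gx
    my = beta * my + (1.0 - beta) * gy
    eta = eta0 / np.sqrt(t + 1.0)
    # Normalizing the momentum bounds each step's size, so a single
    # heavy-tailed outlier cannot blow up the iterates.
    x -= eta * mx / (np.linalg.norm(mx) + 1e-12)   # descent on x
    y += eta * my / (np.linalg.norm(my) + 1e-12)   # ascent on y

print("||x|| =", np.linalg.norm(x), " ||y|| =", np.linalg.norm(y))
```

The key design choice is that the update direction is the normalized momentum, so the per-step movement is bounded by the step size regardless of how large a noisy gradient sample is; a federated variant would run such updates locally and average the iterates across clients.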
Similar Papers
Decentralized Nonconvex Optimization under Heavy-Tailed Noise: Normalization and Optimal Convergence
Optimization and Control
Helps computers learn better with messy data.
Nonconvex Decentralized Stochastic Bilevel Optimization under Heavy-Tailed Noises
Machine Learning (CS)
Teaches computers to learn better with messy data.
Stochastic Bilevel Optimization with Heavy-Tailed Noise
Machine Learning (CS)
Teaches computers to learn better from messy data.