LSAM: Asynchronous Distributed Training with Landscape-Smoothed Sharpness-Aware Minimization
By: Yunfei Teng, Sixin Zhang
Potential Business Impact:
Makes AI learn better and faster.
While Sharpness-Aware Minimization (SAM) improves generalization in deep neural networks by minimizing both loss and sharpness, it suffers from inefficiency in distributed large-batch training. We present Landscape-Smoothed SAM (LSAM), a novel optimizer that preserves SAM's generalization advantages while offering superior efficiency. LSAM integrates SAM's adversarial steps with an asynchronous distributed sampling strategy, producing a smoothed sharpness-aware loss landscape for optimization. This design eliminates synchronization bottlenecks, accelerates large-batch convergence, and delivers higher final accuracy compared to data-parallel SAM.
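To make the adversarial step concrete, below is a minimal sketch of a standard SAM update (not LSAM's asynchronous distributed scheme, whose details the abstract does not specify), assuming a toy quadratic loss and hand-written gradients purely for illustration:

```python
import numpy as np

# Toy quadratic loss and its gradient (stand-ins for a real model's loss).
def loss(w):
    return 0.5 * np.dot(w, w)

def grad(w):
    return w

def sam_step(w, lr=0.1, rho=0.05):
    """One SAM update: ascend to the adversarial point within radius rho,
    then descend from the original weights using the gradient taken there."""
    g = grad(w)
    # Adversarial perturbation along the gradient direction (sharpness probe).
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    g_adv = grad(w + eps)   # gradient at the perturbed, sharpness-aware point
    return w - lr * g_adv   # descent step applied to the original weights

w = np.array([1.0, -2.0])
for _ in range(10):
    w = sam_step(w)
print(w)
```

In data-parallel SAM this two-gradient step is synchronized across workers every iteration; LSAM's asynchronous sampling is aimed at removing exactly that synchronization bottleneck.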
Similar Papers
Asynchronous Sharpness-Aware Minimization For Fast and Accurate Deep Learning
Machine Learning (CS)
Makes smart computer programs learn faster and better.
Bi-LoRA: Efficient Sharpness-Aware Minimization for Fine-Tuning Large-Scale Models
Machine Learning (CS)
Makes AI learn better with less data.
Focal-SAM: Focal Sharpness-Aware Minimization for Long-Tailed Classification
Machine Learning (CS)
Teaches computers to learn from rare examples.