SPARK: Igniting Communication-Efficient Decentralized Learning via Stage-wise Projected NTK and Accelerated Regularization
By: Li Xia
Decentralized federated learning (DFL) faces critical challenges from statistical heterogeneity and communication overhead. While neural tangent kernel (NTK)-based methods achieve faster convergence, transmitting full Jacobian matrices is impractical for bandwidth-constrained edge networks. We propose SPARK, which integrates three complementary components: random projection-based Jacobian compression, stage-wise annealed distillation, and Nesterov momentum acceleration. Random projections compress Jacobians while preserving the spectral properties essential for convergence. Stage-wise annealed distillation transitions from pure NTK evolution to neighbor-regularized learning, counteracting compression noise. Nesterov momentum accelerates convergence through stable accumulation enabled by the smoothing effect of distillation. SPARK reduces communication by 98.7% relative to NTK-DFL while maintaining convergence speed and delivering superior accuracy. With momentum, SPARK reaches target performance three times faster, establishing state-of-the-art results for communication-efficient decentralized learning and enabling practical deployment in bandwidth-limited edge environments.
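As a rough illustration of the compression idea described above, the following minimal Python/NumPy sketch (not the authors' implementation; the Gaussian sketch, the dimensions, and all variable names are assumptions for illustration) compresses a per-client Jacobian with a shared random projection and checks how well the compressed version approximates the NTK Gram matrix.

```python
# Minimal sketch of random-projection Jacobian compression, assuming a
# Gaussian sketching matrix shared by all clients. Only the compressed
# Jacobian (n_samples x k) would need to be transmitted, instead of the
# full Jacobian (n_samples x n_params).
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_params, k = 64, 10_000, 256   # k << n_params sets the compression ratio

# Hypothetical per-client Jacobian J (one row per local sample).
J = rng.standard_normal((n_samples, n_params))

# Shared Gaussian projection S with entries ~ N(0, 1/k), so E[S S^T] = I.
S = rng.standard_normal((n_params, k)) / np.sqrt(k)

J_compressed = J @ S                       # what gets communicated
ntk_exact = J @ J.T                        # full NTK Gram matrix (reference only)
ntk_approx = J_compressed @ J_compressed.T # NTK estimate from compressed Jacobians

rel_err = np.linalg.norm(ntk_exact - ntk_approx) / np.linalg.norm(ntk_exact)
print(f"compression ratio: {k / n_params:.3f}, relative NTK error: {rel_err:.3f}")
```

Because E[S Sᵀ] = I, the compressed Gram matrix is an unbiased estimate of J Jᵀ, which is the sense in which a random projection can preserve the spectral structure that NTK-style evolution relies on; the residual approximation noise is what the distillation stage in SPARK is described as counteracting.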