Enhancing Parallelism in Decentralized Stochastic Convex Optimization
By: Ofri Eisen, Ron Dorfman, Kfir Y. Levy
Potential Business Impact:
Lets more computers learn together faster.
Decentralized learning has emerged as a powerful approach for handling large datasets across multiple machines in a communication-efficient manner. However, such methods often face scalability limitations, as increasing the number of machines beyond a certain point negatively impacts convergence rates. In this work, we propose Decentralized Anytime SGD, a novel decentralized learning algorithm that significantly extends the critical parallelism threshold, enabling the effective use of more machines without compromising performance. Within the stochastic convex optimization (SCO) framework, we establish a theoretical upper bound on parallelism that surpasses the current state-of-the-art, allowing larger networks to achieve favorable statistical guarantees and closing the gap with centralized learning in highly connected topologies.
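To make the setting concrete, below is a minimal sketch of decentralized SGD with gossip averaging and an anytime-style running average of iterates, run on a synthetic least-squares problem. The ring topology, mixing matrix, step size, and uniform averaging weights are illustrative assumptions, not the exact Decentralized Anytime SGD algorithm or the constants analyzed in the paper.

```python
# Hypothetical sketch: decentralized SGD with gossip averaging and an
# "anytime" running average of iterates, on a synthetic convex problem.
# Topology, step size, and averaging weights are illustrative choices only.
import numpy as np

rng = np.random.default_rng(0)

M, d, n_local, T = 8, 10, 200, 500   # machines, dimension, samples per machine, steps
lr = 0.05

# Synthetic convex objective: shared ground truth, per-machine data shards.
w_star = rng.normal(size=d)
A = [rng.normal(size=(n_local, d)) for _ in range(M)]
b = [A[i] @ w_star + 0.1 * rng.normal(size=n_local) for i in range(M)]

# Ring topology with a symmetric, doubly stochastic mixing matrix.
P = np.zeros((M, M))
for i in range(M):
    P[i, i] = 0.5
    P[i, (i - 1) % M] = 0.25
    P[i, (i + 1) % M] = 0.25

x = np.zeros((M, d))   # local iterates
w = np.zeros((M, d))   # local "anytime" query points (running averages)

for t in range(1, T + 1):
    # Gossip step: each machine averages parameters with its ring neighbors.
    x = P @ x
    w = P @ w

    # Each machine draws one local sample and evaluates a stochastic
    # gradient at its query point w_i (least-squares residual gradient).
    grads = np.zeros_like(x)
    for i in range(M):
        j = rng.integers(n_local)
        grads[i] = (A[i][j] @ w[i] - b[i][j]) * A[i][j]

    # Local SGD step on the iterate, then fold it into the running average.
    x = x - lr * grads
    w = (t / (t + 1)) * w + (1 / (t + 1)) * x   # uniform anytime weights

# Report the average distance of the query points from the ground truth.
print("mean ||w_i - w*||:", np.mean(np.linalg.norm(w - w_star, axis=1)))
```

In this kind of scheme, increasing M reduces per-step gradient noise through averaging, while the gossip step limits how far local models can drift apart; the paper's contribution concerns how many machines can be added before network effects dominate and convergence degrades.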
Similar Papers
Scaling Up Data Parallelism in Decentralized Deep Learning
Machine Learning (CS)
Makes AI learn faster on many computers.
Asynchronous Decentralized SGD under Non-Convexity: A Block-Coordinate Descent Framework
Machine Learning (CS)
Helps computers learn together faster, even with slow connections.
Decentralized Optimization with Amplified Privacy via Efficient Communication
Systems and Control
Keeps secret messages safe while learning.