Bandwidth-Aware Network Topology Optimization for Decentralized Learning
By: Yipeng Shen, Zehan Zhu, Yan Huang, and more
Potential Business Impact:
Speeds up distributed machine learning by designing network connections that account for bandwidth limits.
Network topology is critical for efficient parameter synchronization in distributed learning over networks. However, most existing studies do not account for bandwidth limitations in network topology design. In this paper, we propose a bandwidth-aware network topology optimization framework that maximizes consensus speed under edge cardinality constraints. For heterogeneous bandwidth scenarios, we introduce a maximum bandwidth allocation strategy for the edges to ensure efficient communication among nodes. By reformulating the problem as an equivalent mixed-integer semidefinite program (SDP), we leverage a computationally efficient method based on the alternating direction method of multipliers (ADMM) to obtain topologies that yield the maximum consensus speed. Within each ADMM substep, we adopt the conjugate gradient method to solve the large-scale linear systems efficiently, achieving better scalability. Experimental results demonstrate that the resulting network topologies outperform benchmark topologies in consensus speed and reduce the training time needed to reach a target test accuracy in decentralized learning tasks on real-world datasets, with speedups of more than $1.11\times$ and $1.21\times$ under homogeneous and heterogeneous bandwidth settings, respectively.
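The abstract scores candidate topologies by consensus speed. As a rough, self-contained illustration (not the authors' implementation), the Python sketch below builds a doubly stochastic mixing matrix for a candidate topology using Metropolis-Hastings weights, measures consensus speed via the second-largest eigenvalue modulus (SLEM, the standard per-iteration contraction factor toward consensus), and runs a conjugate-gradient solve of the kind an ADMM substep would require. The Metropolis weighting and the positive definite system `A = 2I - W` are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.sparse.linalg import cg

def metropolis_weights(adj):
    """Symmetric, doubly stochastic mixing matrix from a 0/1 adjacency matrix."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()  # rows (and by symmetry, columns) sum to 1
    return W

def consensus_rate(W):
    """Second-largest eigenvalue modulus (SLEM); smaller means faster consensus."""
    lam = np.sort(np.abs(np.linalg.eigvalsh(W)))
    return lam[-2]  # lam[-1] == 1 for a doubly stochastic W

# A 6-node ring: each undirected edge consumes one unit of the edge budget.
n = 6
ring = np.zeros((n, n), dtype=int)
for i in range(n):
    ring[i, (i + 1) % n] = ring[(i + 1) % n, i] = 1

W = metropolis_weights(ring)
print(f"ring SLEM: {consensus_rate(W):.4f}")  # ~0.6667 for this ring

# Inside an ADMM substep, one typically faces a large symmetric positive
# definite system A x = b; conjugate gradient solves it iteratively without
# factorizing A. The system below is illustrative, not the paper's.
A = 2.0 * np.eye(n) - W  # SPD because all eigenvalues of W lie in [-1, 1]
b = np.ones(n)
x, info = cg(A, b)
print("CG converged:", info == 0)
```

Under this metric, a topology search like the paper's amounts to choosing, within the edge budget, the adjacency matrix whose mixing matrix has the smallest SLEM.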
Similar Papers
Delay-Tolerant Augmented-Consensus-based Distributed Directed Optimization
Systems and Control
Keeps distributed optimization working despite communication delays in the network.
Towards Heterogeneity-Aware and Energy-Efficient Topology Optimization for Decentralized Federated Learning in Edge Environment
Machine Learning (CS)
Designs energy-efficient network topologies so edge devices can train AI together without sharing private data.
Communication Optimization for Decentralized Learning atop Bandwidth-limited Edge Networks
Networking and Internet Architecture
Reduces communication so edge devices with limited bandwidth can learn together faster.