SuperSFL: Resource-Heterogeneous Federated Split Learning with Weight-Sharing Super-Networks
By: Abdullah Al Asif, Sixing Yu, Juan Pablo Munoz, and more
Potential Business Impact:
Lets smart devices with very different computing power learn together faster.
SplitFed Learning (SFL) combines federated learning and split learning to enable collaborative training across distributed edge devices; however, it faces significant challenges in heterogeneous environments with diverse computational and communication capabilities. This paper proposes SuperSFL, a federated split learning framework that leverages a weight-sharing super-network to dynamically generate resource-aware client-specific subnetworks, effectively mitigating device heterogeneity. SuperSFL introduces Three-Phase Gradient Fusion (TPGF), an optimization mechanism that coordinates local updates, server-side computation, and gradient fusion to accelerate convergence. In addition, a fault-tolerant client-side classifier and collaborative client-server aggregation enable uninterrupted training under intermittent communication failures. Experimental results on CIFAR-10 and CIFAR-100 with up to 100 heterogeneous clients show that SuperSFL converges 2-5× faster in terms of communication rounds than baseline SFL while achieving higher accuracy, resulting in up to 20× lower total communication cost and 13× shorter training time. SuperSFL also demonstrates improved energy efficiency compared to baseline methods, making it a practical solution for federated learning in heterogeneous edge environments.
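To make the weight-sharing super-network idea concrete, below is a minimal PyTorch sketch of how resource-aware, client-specific subnetworks can be sliced from one shared set of weights. This is an illustrative assumption, not the paper's actual implementation: the class names (SuperLinear, SuperNet), the width_mult parameter, and the client_budgets mapping are all hypothetical, and the real SuperSFL subnetwork generation, split point, and TPGF optimization are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SuperLinear(nn.Module):
    """Linear layer whose weight matrix is shared across sub-networks.

    A sub-network uses only the first `out_keep` rows and `in_keep` columns
    of the full weight, so every client's parameters are slices of the same
    super-network tensors (weight sharing).
    """
    def __init__(self, max_in, max_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out, max_in) * 0.01)
        self.bias = nn.Parameter(torch.zeros(max_out))

    def forward(self, x, in_keep, out_keep):
        w = self.weight[:out_keep, :in_keep]
        b = self.bias[:out_keep]
        return F.linear(x, w, b)

class SuperNet(nn.Module):
    """Two-layer super-network; `width_mult` scales the hidden width per client."""
    def __init__(self, in_dim=32, hidden=256, num_classes=10):
        super().__init__()
        self.hidden = hidden
        self.fc1 = SuperLinear(in_dim, hidden)
        self.fc2 = SuperLinear(hidden, num_classes)
        self.in_dim, self.num_classes = in_dim, num_classes

    def forward(self, x, width_mult=1.0):
        h = max(1, int(self.hidden * width_mult))   # resource-aware hidden width
        x = F.relu(self.fc1(x, self.in_dim, h))
        return self.fc2(x, h, self.num_classes)

# Hypothetical per-client resource budgets mapped to sub-network widths.
client_budgets = {"phone": 0.25, "laptop": 0.5, "edge_server": 1.0}

net = SuperNet()
x = torch.randn(4, 32)
for client, mult in client_budgets.items():
    logits = net(x, width_mult=mult)   # same shared weights, different slice
    print(client, logits.shape)
```

Because every subnetwork is a slice of the same parameter tensors, updates from weak and strong clients can be aggregated back into one super-network, which is the property the abstract relies on to handle device heterogeneity.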
Similar Papers
Collaborative Split Federated Learning with Parallel Training and Aggregation
Distributed, Parallel, and Cluster Computing
Trains AI faster with smarter teamwork.
Communication-and-Computation Efficient Split Federated Learning: Gradient Aggregation and Resource Management
Distributed, Parallel, and Cluster Computing
Makes AI learn faster with less data sent.
Enhancing Split Learning with Sharded and Blockchain-Enabled SplitFed Approaches
Distributed, Parallel, and Cluster Computing
Makes AI learn safely from many computers.