Resource Utilization Optimized Federated Learning

Published: March 10, 2025 | arXiv ID: 2504.13850v1

By: Zihan Zhang, Leon Wong, Blesson Varghese

Potential Business Impact:

Improves the efficiency of federated learning systems, enabling faster, more accurate model training across distributed servers and devices.

Business Areas:
Cloud Computing, Internet Services, Software

Federated learning (FL) systems facilitate distributed machine learning across a server and multiple devices. However, FL systems suffer from low resource utilization, which limits their practical use in the real world. This inefficiency primarily arises from two types of idle time: (i) task dependency between the server and devices, and (ii) stragglers among heterogeneous devices. This paper introduces FedOptima, a resource-optimized FL system designed to minimize both types of idle time simultaneously; existing systems do not eliminate or reduce both at once. FedOptima offloads the training of certain layers of a neural network from a device to the server using three innovations. First, devices operate independently of each other, using asynchronous aggregation to eliminate straggler effects, and independently of the server, using auxiliary networks to minimize the idle time caused by task dependency. Second, the server performs centralized training with a task scheduler that ensures balanced contributions from all devices, improving model accuracy. Third, an efficient memory management mechanism on the server improves scalability with respect to the number of participating devices. Four state-of-the-art offloading-based and asynchronous FL methods are chosen as baselines. Experimental results show that, compared to the best baseline results for convolutional neural networks and transformers on multiple lab-based testbeds, FedOptima (i) achieves higher or comparable accuracy, (ii) accelerates training by 1.9x to 21.8x, (iii) reduces server and device idle time by up to 93.9% and 81.8%, respectively, and (iv) increases throughput by 1.1x to 2.0x.
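The layer-offloading and auxiliary-network idea can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: a device keeps only the early layers of a CNN and attaches a small auxiliary head so it can compute a local loss without waiting for gradients back from the server. The module shapes and the `device_step` helper are illustrative assumptions.

```python
# Minimal sketch (assumed, not FedOptima's code) of device-side training
# with an auxiliary network: the device trains its early layers against
# a local auxiliary loss, breaking the task dependency on the server.
import torch
import torch.nn as nn

class DeviceModel(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Early layers kept on the device; later layers live on the server.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        # Auxiliary head: lets the device compute a loss locally.
        self.aux_head = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 4 * 4, num_classes)
        )

    def forward(self, x):
        feats = self.features(x)
        return feats, self.aux_head(feats)

def device_step(model, opt, x, y, loss_fn=nn.CrossEntropyLoss()):
    """One local step: update early layers via the auxiliary loss, then
    ship detached activations to the server for its own training."""
    feats, aux_logits = model(x)
    loss = loss_fn(aux_logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Detached, so no gradient needs to flow back from the server.
    return feats.detach(), y
```

Because the returned activations are detached, the device never blocks on a server round trip; the server trains the offloaded layers on these activations on its own schedule.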
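On the server side, asynchronous aggregation merges each device update as it arrives rather than waiting for stragglers. A common heuristic (assumed here; the abstract does not specify FedOptima's exact rule) is to down-weight updates by their staleness:

```python
# Sketch of staleness-weighted asynchronous aggregation. The
# 1/(1 + staleness) schedule is a standard heuristic, not necessarily
# the rule FedOptima uses.
from typing import Dict
import torch

def async_aggregate(global_params: Dict[str, torch.Tensor],
                    device_params: Dict[str, torch.Tensor],
                    device_round: int,
                    server_round: int,
                    base_lr: float = 0.5) -> Dict[str, torch.Tensor]:
    # Older updates contribute less, so slow devices cannot drag the
    # global model backward, and fast devices never wait for them.
    staleness = server_round - device_round
    alpha = base_lr / (1 + staleness)
    return {
        name: (1 - alpha) * g + alpha * device_params[name]
        for name, g in global_params.items()
    }
```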
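The abstract also mentions a task scheduler that keeps device contributions balanced during the server's centralized training. One plausible reading, sketched below under that assumption, is a least-served-first policy: the server always trains next on a batch from the device whose data has contributed least so far.

```python
# Hypothetical balance-oriented scheduler (an assumption, not the
# paper's algorithm): pick work from the least-served device so no
# single device dominates the centrally trained layers.
from collections import defaultdict

class BalancedScheduler:
    def __init__(self):
        self.contributions = defaultdict(int)  # device_id -> batches used
        self.queues = defaultdict(list)        # device_id -> pending batches

    def submit(self, device_id, batch):
        self.queues[device_id].append(batch)

    def next_batch(self):
        # Among devices with pending work, choose the least-served one.
        ready = [d for d, q in self.queues.items() if q]
        if not ready:
            return None
        device = min(ready, key=lambda d: self.contributions[d])
        self.contributions[device] += 1
        return device, self.queues[device].pop(0)
```

Under heterogeneous devices, a policy like this prevents the fastest devices from skewing the server-trained layers toward their local data distributions, which is consistent with the accuracy benefit the abstract attributes to the scheduler.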

Country of Origin
🇬🇧 United Kingdom

Page Count
12 pages

Category
Computer Science:
Distributed, Parallel, and Cluster Computing