Score: 1

Enabling Weak Client Participation via On-device Knowledge Distillation in Heterogenous Federated Learning

Published: March 14, 2025 | arXiv ID: 2503.11151v2

By: Jihyun Lim, Junhyuk Jo, Tuo Zhang, and more

Potential Business Impact:

Lets resource-limited edge devices help train large AI models using their own unlabeled data, without centralizing that data on a server.

Business Areas:
Crowdsourcing Collaboration

Online Knowledge Distillation (KD) has recently been highlighted as a way to train large models in Federated Learning (FL) environments. Many existing studies adopt the logit ensemble method to perform KD on the server side. However, they often assume that unlabeled data collected at the edge is centralized on the server. Moreover, the logit ensemble method personalizes local models, which can degrade the quality of soft targets, especially when data is highly non-IID. To address these critical limitations, we propose a novel on-device KD-based heterogeneous FL method. Our approach leverages a small auxiliary model to learn from labeled local data. Subsequently, a subset of clients with strong system resources transfers knowledge to a large model through on-device KD using their unlabeled data. Our extensive experiments demonstrate that our on-device KD-based heterogeneous FL method effectively utilizes the system resources of all edge devices as well as the unlabeled data, resulting in higher accuracy than SOTA KD-based FL methods.
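The core mechanism described above, where a small auxiliary model trained on labeled local data teaches a large model on a strong client using unlabeled local data, can be illustrated with a minimal sketch. This sketch assumes PyTorch; the function name, the temperature-scaled KL-divergence loss, and the hyperparameters are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def on_device_distillation_step(small_model: nn.Module,
                                large_model: nn.Module,
                                unlabeled_batch: torch.Tensor,
                                optimizer: torch.optim.Optimizer,
                                temperature: float = 2.0) -> float:
    """One on-device KD step on a strong client (hypothetical sketch):
    the small auxiliary model, already trained on labeled local data,
    provides soft targets that the large model fits on unlabeled data."""
    small_model.eval()
    large_model.train()

    with torch.no_grad():
        # Soft targets from the auxiliary (teacher) model.
        teacher_logits = small_model(unlabeled_batch)

    student_logits = large_model(unlabeled_batch)

    # Standard temperature-scaled KD loss: KL divergence between
    # softened teacher and student distributions.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this reading, weak clients only need to train the small auxiliary model, while the on-device distillation into the large model runs solely on the subset of clients with sufficient system resources; the exact aggregation and client-selection details are in the paper itself.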

Country of Origin
🇰🇷 Korea, Republic of

Page Count
10 pages

Category
Computer Science:
Machine Learning (CS)