Enabling Weak Client Participation via On-device Knowledge Distillation in Heterogeneous Federated Learning
By: Jihyun Lim, Junhyuk Jo, Tuo Zhang, and more
Potential Business Impact:
Lets weaker phones and devices help train large AI models using their own data.
Online Knowledge Distillation (KD) has recently been highlighted as a way to train large models in Federated Learning (FL) environments. Many existing studies adopt the logit ensemble method to perform KD on the server side. However, they often assume that unlabeled data collected at the edge is centralized on the server. Moreover, the logit ensemble method personalizes local models, which can degrade the quality of soft targets, especially when data is highly non-IID. To address these critical limitations, we propose a novel on-device KD-based heterogeneous FL method. Our approach leverages a small auxiliary model to learn from labeled local data. Subsequently, a subset of clients with strong system resources transfers knowledge to a large model through on-device KD using their unlabeled data. Our extensive experiments demonstrate that our on-device KD-based heterogeneous FL method effectively utilizes the system resources of all edge devices as well as the unlabeled data, resulting in higher accuracy than state-of-the-art (SOTA) KD-based FL methods.
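The core step the abstract describes, a resource-strong client distilling the small auxiliary model's knowledge into a large model over its unlabeled local data, can be sketched roughly as follows. This is only an illustrative PyTorch sketch of standard temperature-scaled distillation under assumed placeholder names (the model objects, the `temperature` value, and the optimizer are not from the paper), not the authors' implementation.

```python
# Minimal sketch of an on-device KD step on a strong client, assuming a
# PyTorch setup. Names and hyperparameters here are illustrative placeholders.
import torch
import torch.nn.functional as F

def on_device_kd_step(large_model, aux_model, unlabeled_batch,
                      optimizer, temperature=4.0):
    """One distillation step on a resource-strong client.

    The small auxiliary model (trained federatedly on labeled data) acts as
    the teacher; the large model is the student; the batch is unlabeled.
    """
    aux_model.eval()
    large_model.train()

    with torch.no_grad():
        # Soft targets from the small auxiliary model.
        teacher_logits = aux_model(unlabeled_batch)

    student_logits = large_model(unlabeled_batch)

    # Standard KD loss: KL divergence between temperature-softened
    # distributions, scaled by T^2 to keep gradient magnitudes comparable.
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    optimizer.zero_grad()
    kd_loss.backward()
    optimizer.step()
    return kd_loss.item()
```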
Similar Papers
HFedCKD: Toward Robust Heterogeneous Federated Learning via Data-free Knowledge Distillation and Two-way Contrast
Machine Learning (CS)
Helps AI learn better from many different phones.
A Novel Algorithm for Personalized Federated Learning: Knowledge Distillation with Weighted Combination Loss
Machine Learning (Stat)
Teaches computers to learn from private data better.
FedSKD: Aggregation-free Model-heterogeneous Federated Learning using Multi-dimensional Similarity Knowledge Distillation
Machine Learning (CS)
Lets computers learn together without sharing private data.