Clustered Federated Learning via Embedding Distributions
By: Dekai Zhang, Matthew Williams, Francesca Toni
Potential Business Impact:
Groups clients with similar data so shared models learn more accurately.
Federated learning (FL) is a widely used framework for machine learning in distributed settings where clients hold data that cannot easily be centralised, for example for data protection reasons. FL is, however, known to be vulnerable to non-IID data. Clustered FL addresses this issue by partitioning clients into more homogeneous clusters. We propose a novel one-shot clustering method, EMD-CFL, which uses the Earth Mover's distance (EMD) between data distributions in embedding space. We theoretically motivate the use of EMDs using results from the domain adaptation literature and empirically demonstrate superior clustering performance in extensive comparisons against 16 baselines and on a range of challenging datasets.
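To make the core idea concrete, the sketch below is a minimal illustration, not the authors' EMD-CFL implementation: it treats each client's embedded samples as an empirical distribution with uniform weights (so the EMD reduces to an optimal assignment problem), computes pairwise EMDs between clients, and groups clients by agglomerative clustering on the resulting distance matrix. The function names, the assignment-based EMD estimator, and the choice of clustering step are assumptions made for illustration only.

```python
import numpy as np
from scipy.spatial.distance import cdist, squareform
from scipy.optimize import linear_sum_assignment
from scipy.cluster.hierarchy import linkage, fcluster

def emd_uniform(emb_a, emb_b):
    # EMD between two equal-size sets of embeddings with uniform weights;
    # in this special case it reduces to a minimum-cost matching.
    cost = cdist(emb_a, emb_b)                # pairwise Euclidean costs
    rows, cols = linear_sum_assignment(cost)  # optimal one-to-one matching
    return cost[rows, cols].mean()

def cluster_clients(client_embeddings, num_clusters):
    # client_embeddings: list of (n_samples, dim) arrays, one per client.
    k = len(client_embeddings)
    dists = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            d = emd_uniform(client_embeddings[i], client_embeddings[j])
            dists[i, j] = dists[j, i] = d
    # Agglomerative clustering on the pairwise EMD matrix (one-shot grouping).
    Z = linkage(squareform(dists), method="average")
    return fcluster(Z, t=num_clusters, criterion="maxclust")

# Toy usage: two groups of clients whose embedding distributions differ.
rng = np.random.default_rng(0)
clients = [rng.normal(0.0, 1.0, (64, 16)) for _ in range(3)] + \
          [rng.normal(3.0, 1.0, (64, 16)) for _ in range(3)]
print(cluster_clients(clients, num_clusters=2))  # e.g. [1 1 1 2 2 2]
```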
Similar Papers
Redefining non-IID Data in Federated Learning for Computer Vision Tasks: Migrating from Labels to Embeddings for Task-Specific Data Distributions
CV and Pattern Recognition
Helps computers learn together without sharing private data.
One-Shot Clustering for Federated Learning
Machine Learning (CS)
Finds best time to group devices for learning.
LCFed: An Efficient Clustered Federated Learning Framework for Heterogeneous Data
Machine Learning (CS)
Makes AI learn better from different groups of data.