FedDiverse: Tackling Data Heterogeneity in Federated Learning with Diversity-Driven Client Selection
By: Gergely D. Németh, Eros Fanì, Yeat Jeng Ng, and more
Potential Business Impact:
Helps AI learn better from different data.
Federated Learning (FL) enables decentralized training of machine learning models on distributed data while preserving privacy. However, in real-world FL settings, client data is often non-identically distributed and imbalanced, producing statistical data heterogeneity that degrades the generalization of the server's model across clients, slows convergence, and reduces performance. In this paper, we address this challenge by first proposing a characterization of statistical data heterogeneity based on six metrics of global and client-level attribute imbalance, class imbalance, and spurious correlations. Next, we create and release seven computer vision datasets for binary and multiclass image classification in Federated Learning that cover a broad range of statistical data heterogeneity and thus simulate real-world conditions. Finally, we propose FEDDIVERSE, a novel client selection algorithm for FL designed to manage and leverage data heterogeneity across clients by promoting collaboration between clients with complementary data distributions. Experiments on the seven proposed FL datasets demonstrate FEDDIVERSE's effectiveness in improving the performance and robustness of a variety of FL methods while incurring low communication and computational overhead.
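The abstract does not spell out FEDDIVERSE's exact metrics or selection rule, so the following is only a minimal sketch of the general idea of diversity-driven client selection: greedily picking clients whose (hypothetical) per-client class histograms complement each other so that the combined label distribution of the selected round is as balanced as possible. The function names (`select_diverse_clients`, `diversity_score`) and the KL-to-uniform criterion are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def diversity_score(label_dists, selected, num_classes):
    """Negative KL divergence between the selected clients' averaged label
    distribution and the uniform distribution (higher = more balanced mix)."""
    agg = np.mean([label_dists[i] for i in selected], axis=0) + 1e-12
    uniform = np.full(num_classes, 1.0 / num_classes)
    return -np.sum(agg * np.log(agg / uniform))

def select_diverse_clients(label_dists, num_select):
    """Greedily pick clients whose class distributions complement each other.

    label_dists: (num_clients, num_classes) array, each row a client's
    normalized class histogram (a stand-in for whatever heterogeneity
    statistics the server can access).
    """
    num_clients, num_classes = label_dists.shape
    selected, remaining = [], set(range(num_clients))
    while len(selected) < num_select and remaining:
        # Pick the client that most improves the balance of the current mix.
        best = max(remaining,
                   key=lambda c: diversity_score(label_dists, selected + [c], num_classes))
        selected.append(best)
        remaining.remove(best)
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical label-skewed clients: Dirichlet-sampled class histograms.
    dists = rng.dirichlet(alpha=np.full(10, 0.3), size=20)
    print(select_diverse_clients(dists, num_select=5))
```

Selecting by complementarity rather than uniformly at random is what lets a round's aggregate update see a more balanced slice of the global distribution, which is the intuition behind promoting collaboration between clients with complementary data.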
Similar Papers
Client Selection in Federated Learning with Data Heterogeneity and Network Latencies
Machine Learning (CS)
Makes smart computers learn faster from different data.
FedQuad: Federated Stochastic Quadruplet Learning to Mitigate Data Heterogeneity
Machine Learning (CS)
Makes AI learn better from many different computers.
Not All Clients Are Equal: Personalized Federated Learning on Heterogeneous Multi-Modal Clients
Machine Learning (CS)
AI learns from everyone without sharing private data.