Clustering-Based User Selection in Federated Learning: Metadata Exploitation for 3GPP Networks
By: Ce Zheng, Shiyao Ma, Ke Zhang and more
Potential Business Impact:
Helps AI learn from many people without seeing their private info.
Federated learning (FL) enables collaborative model training without sharing raw user data, but conventional simulations often rely on unrealistic data partitioning, and current user selection methods ignore data correlation among users. To address these challenges, this paper proposes a metadata-driven FL framework. We first introduce a novel data partition model based on a homogeneous Poisson point process (HPPP), capturing both heterogeneity in data quantity and natural overlap among user datasets. Building on this model, we develop a clustering-based user selection strategy that leverages metadata, such as user location, to reduce data correlation and enhance label diversity across training rounds. Extensive experiments on FMNIST and CIFAR-10 demonstrate that the proposed framework improves model performance, stability, and convergence in non-IID scenarios, while maintaining comparable performance under IID settings. Furthermore, the method shows pronounced advantages when the number of selected users per round is small. These findings highlight the framework's potential for enhancing FL performance in realistic deployments and guiding future standardization.
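The two ideas in the abstract, an HPPP-based data partition and location-based clustering for user selection, can be illustrated with a small sketch. The abstract does not give the paper's exact construction, so everything specific here is an assumption made for illustration: users and data samples are drawn as two independent HPPPs on a square, a user holds every sample within a fixed `radius` of its position (which yields both unequal dataset sizes and overlapping datasets), and selection runs a plain k-means on user coordinates and picks one user per cluster. All function names and parameter values are hypothetical.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's method (fine for modest means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_hppp(intensity, side, rng):
    """Points of a homogeneous PPP with the given intensity on a side x side square."""
    n = sample_poisson(intensity * side * side, rng)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def partition_data(users, samples, radius):
    """Assumed rule: a user holds the indices of all samples within `radius`."""
    return [
        {i for i, s in enumerate(samples) if math.dist(u, s) <= radius}
        for u in users
    ]


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on 2-D points; returns a cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels


def select_users(users, k, rng):
    """Pick one user at random from each non-empty location cluster."""
    k = min(k, len(users))
    labels = kmeans(users, k, rng)
    selected = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            selected.append(rng.choice(members))
    return selected


rng = random.Random(0)
users = sample_hppp(intensity=0.3, side=10.0, rng=rng)    # about 30 users on average
samples = sample_hppp(intensity=3.0, side=10.0, rng=rng)  # about 300 samples on average
datasets = partition_data(users, samples, radius=2.0)
chosen = select_users(users, k=5, rng=rng)
print("dataset sizes of selected users:", [len(datasets[i]) for i in chosen])
```

Under these assumptions, users that sit close together share many samples, so drawing the round's participants from different location clusters tends to reduce overlap among the selected datasets. This is the intuition behind using location metadata for selection, not a reproduction of the paper's algorithm.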
Similar Papers
Hybrid Federated Learning for Noise-Robust Training
Machine Learning (CS)
Helps phones learn together without sharing private info.
Robust Federated Learning in Unreliable Wireless Networks: A Client Selection Approach
Distributed, Parallel, and Cluster Computing
Fixes AI learning when internet is bad.
Client Selection in Federated Learning with Data Heterogeneity and Network Latencies
Machine Learning (CS)
Makes smart computers learn faster from different data.