PSI-PFL: Population Stability Index for Client Selection in non-IID Personalized Federated Learning
By: Daniel-M. Jimenez-Gutierrez, David Solans, Mohammed Elbamby and more
Potential Business Impact:
Makes AI learn better from private, different data.
Federated Learning (FL) enables decentralized machine learning (ML) model training while preserving data privacy by keeping data localized across clients. However, non-independent and identically distributed (non-IID) data across clients poses a significant challenge, leading to skewed model updates and performance degradation. To address this, we propose PSI-PFL, a novel client selection framework for Personalized Federated Learning (PFL) that leverages the Population Stability Index (PSI) to quantify and mitigate data heterogeneity (so-called non-IIDness). Our approach selects more homogeneous clients based on PSI, reducing the impact of label skew, one of the most detrimental factors in FL performance. Experimental results over multiple data modalities (tabular, image, text) demonstrate that PSI-PFL significantly improves global model accuracy, outperforming state-of-the-art baselines by up to 10% under non-IID scenarios while ensuring fairer local performance. PSI-PFL enhances FL performance and offers practical benefits in applications where data privacy and heterogeneity are critical.
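The abstract does not spell out how PSI is computed or how the selection rule is applied, so the following is only a minimal sketch of the general idea, using the standard PSI definition, PSI = sum over classes of (a_i - e_i) * ln(a_i / e_i), where a_i is a client's label proportion and e_i is the reference proportion. The function names, the pooled-label reference distribution, the smoothing constant eps, and the 0.25 threshold (a common rule of thumb for PSI, not a value taken from the paper) are all assumptions made for illustration.

```python
import math
from collections import Counter


def label_distribution(labels, classes, eps=1e-6):
    """Return class proportions for a non-empty list of labels.

    Proportions are clipped at eps (an assumed smoothing constant) so the
    logarithm in PSI stays defined for classes a client never sees,
    then renormalized to sum to 1.
    """
    counts = Counter(labels)
    total = len(labels)
    clipped = [max(counts.get(c, 0) / total, eps) for c in classes]
    norm = sum(clipped)
    return [p / norm for p in clipped]


def psi(actual, expected):
    """Standard Population Stability Index between two distributions."""
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


def select_clients(client_labels, threshold=0.25):
    """Keep clients whose label distribution is close to the reference.

    client_labels maps a client id to its list of labels. The reference
    here is the pooled label distribution; this is an assumption, and a
    real deployment would presumably exchange label histograms rather
    than raw labels. The threshold is hypothetical.
    """
    classes = sorted({y for labels in client_labels.values() for y in labels})
    pooled = [y for labels in client_labels.values() for y in labels]
    reference = label_distribution(pooled, classes)
    scores = {
        cid: psi(label_distribution(labels, classes), reference)
        for cid, labels in client_labels.items()
    }
    selected = [cid for cid, score in scores.items() if score <= threshold]
    return selected, scores


# Toy example: client "c" holds a single class and is heavily label-skewed.
clients = {"a": [0, 1, 0, 1], "b": [0, 0, 1, 1], "c": [0, 0, 0, 0]}
selected, scores = select_clients(clients)
print(selected, scores)
```

In this toy setup, clients "a" and "b" have balanced labels and score a low PSI against the pooled distribution, while the single-class client "c" scores far higher and is filtered out. A lower threshold gives a more homogeneous but smaller set of participating clients.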
Similar Papers
PPFPL: Cross-silo Privacy-preserving Federated Prototype Learning Against Data Poisoning Attacks
Cryptography and Security
Protects private data while training smart computer programs.
FedHiP: Heterogeneity-Invariant Personalized Federated Learning Through Closed-Form Solutions
Machine Learning (CS)
Makes AI learn better even with messy data.
A Survey on Cluster-based Federated Learning
Machine Learning (Stat)
Groups computers to learn better from different data.