PSI-PFL: Population Stability Index for Client Selection in non-IID Personalized Federated Learning

Daniel-M. Jimenez-Gutierrez; David Solans; Mohammed Elbamby; Nicolas Kourtellis

arXiv:2506.00440 · cs.LG · June 3, 2025

TL;DR

This paper introduces PSI-PFL, a client selection method for personalized federated learning that uses Population Stability Index to reduce data heterogeneity, leading to improved model accuracy and fairness under non-IID data conditions.

Contribution

We propose PSI-PFL, a novel client selection framework leveraging PSI to effectively mitigate non-IID data issues in federated learning, enhancing accuracy and fairness.

Findings

01 PSI-PFL outperforms existing methods by up to 10% in accuracy.

02 The approach reduces label skew impact in federated learning.

03 Experimental validation across multiple data modalities confirms effectiveness.

Abstract

Federated Learning (FL) enables decentralized machine learning (ML) model training while preserving data privacy by keeping data localized across clients. However, non-independent and identically distributed (non-IID) data across clients poses a significant challenge, leading to skewed model updates and performance degradation. Addressing this, we propose PSI-PFL, a novel client selection framework for Personalized Federated Learning (PFL) that leverages the Population Stability Index (PSI) to quantify and mitigate data heterogeneity (so-called non-IIDness). Our approach selects more homogeneous clients based on PSI, reducing the impact of label skew, one of the most detrimental factors in FL performance. Experimental results over multiple data modalities (tabular, image, text) demonstrate that PSI-PFL significantly improves global model accuracy, outperforming state-of-the-art…
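To make the idea of PSI-based client selection concrete, here is a minimal sketch, not the authors' implementation. It uses the standard PSI definition, the sum over classes of (a_i - e_i) * ln(a_i / e_i), to compare each client's label distribution with a pooled reference, and keeps the clients whose PSI falls at or below a threshold. The function names, the smoothing constant `eps`, the use of pooled labels as the reference, and the 0.25 cutoff (a common rule of thumb for PSI in other domains) are all assumptions made for illustration; the paper's actual reference distribution and threshold may differ. In a real federated setting clients would share label histograms rather than raw labels.

```python
import math
from collections import Counter


def label_distribution(labels, classes, eps=1e-6):
    """Smoothed class proportions; eps (an assumption) avoids log(0) for missing classes."""
    counts = Counter(labels)
    total = len(labels)
    raw = [max(counts.get(c, 0) / total, eps) for c in classes]
    norm = sum(raw)
    return [p / norm for p in raw]


def psi(actual, expected):
    """Standard Population Stability Index between two discrete distributions."""
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


def select_clients(client_labels, classes, threshold=0.25):
    """Keep clients whose label distribution is close to the pooled one.

    The pooled reference and the threshold value are illustrative assumptions.
    """
    pooled = [y for labels in client_labels.values() for y in labels]
    reference = label_distribution(pooled, classes)
    scores = {
        cid: psi(label_distribution(labels, classes), reference)
        for cid, labels in client_labels.items()
    }
    selected = [cid for cid, score in scores.items() if score <= threshold]
    return selected, scores


if __name__ == "__main__":
    # Hypothetical toy clients: "a" and "b" are balanced, "c" holds a single class.
    clients = {"a": [0, 1] * 50, "b": [0, 1] * 50, "c": [0] * 100}
    selected, scores = select_clients(clients, classes=[0, 1])
    print(selected)
    print(scores)
```

In this toy setup the single-class client receives a much larger PSI than the balanced ones, so a threshold-based filter excludes it. That exclusion is the sense in which selecting "more homogeneous clients" reduces label skew.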

Taxonomy

Topics: Privacy-Preserving Technologies in Data