Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning
Yann Fraboni, Richard Vidal, Laetitia Kameni, Marco Lorenzi

TL;DR
This paper introduces clustered sampling in federated learning to improve client representativity, reduce variance, and enhance training stability without extra client operations, outperforming standard sampling methods.
Contribution
It proposes a novel clustered sampling method for client selection in federated learning, improving convergence and stability while maintaining compatibility with existing privacy and compression techniques.
Findings
Clustered sampling reduces variance in client aggregation weights.
It improves training convergence in non-iid, unbalanced scenarios.
The method seamlessly integrates with standard federated learning frameworks.
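The variance finding can be illustrated with a short derivation. The notation is an assumption of this sketch and is not stated in the summary above: $p_i$ is the importance of client $i$, $m$ clients are drawn per round, each drawn client gets aggregation weight $1/m$, $l_k$ is the client picked by draw $k$, and $r_{k,i}$ is the probability that draw $k$ picks client $i$, with $\sum_k r_{k,i} = m p_i$ so that the scheme is unbiased. Draws are assumed independent.

```latex
\begin{align}
\omega_i &= \frac{1}{m}\sum_{k=1}^{m} \mathbb{1}[l_k = i],
\qquad
\mathbb{E}[\omega_i] = \frac{1}{m}\sum_{k=1}^{m} r_{k,i} = p_i, \\
\mathrm{Var}[\omega_i]
 &= \frac{1}{m^2}\sum_{k=1}^{m} r_{k,i}\,(1 - r_{k,i})
  = \frac{1}{m^2}\Big(m\,p_i - \sum_{k=1}^{m} r_{k,i}^2\Big), \\
\sum_{k=1}^{m} r_{k,i}^2 &\ge \frac{1}{m}\Big(\sum_{k=1}^{m} r_{k,i}\Big)^{2} = m\,p_i^2
\quad\Longrightarrow\quad
\mathrm{Var}[\omega_i] \le \frac{1}{m}\,p_i\,(1 - p_i).
\end{align}
```

The right-hand bound is the variance obtained when every draw uses the same distribution, $r_{k,i} = p_i$, which corresponds to sampling all $m$ clients from a single multinomial distribution. Under these assumptions, any unbiased per-draw distributions give a variance no larger than that baseline.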
Abstract
This work addresses the problem of optimizing communications between server and clients in federated learning (FL). Current sampling approaches in FL are either biased, or non-optimal in terms of server-clients communications and training stability. To overcome this issue, we introduce clustered sampling for clients selection. We prove that clustered sampling leads to better clients representativity and to reduced variance of the clients' stochastic aggregation weights in FL. Compatibly with our theory, we provide two different clustering approaches enabling clients aggregation based on 1) sample size, and 2) models similarity. Through a series of experiments in non-iid and unbalanced scenarios, we demonstrate that model aggregation through clustered sampling consistently leads to better training convergence and variability when compared to standard sampling approaches. Our…
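A minimal Python sketch of the sample-size-based variant mentioned in the abstract follows. It is an illustration under stated assumptions and not the authors' reference implementation: the function names `size_based_distributions` and `sample_clients` are hypothetical, and the allocation rule is assumed here (each client receives `m * n_i` tokens, which are poured in order of decreasing sample size into `m` distributions of equal capacity). One client is then drawn from each distribution.

```python
import random
from fractions import Fraction


def size_based_distributions(sample_sizes, m):
    """Build m sampling distributions from client sample sizes.

    Assumed allocation rule (illustrative): client i holds m * n_i tokens,
    each distribution holds exactly sum(n_i) tokens, and clients are
    poured in by decreasing sample size.
    Returns a list of dicts mapping client index -> selection probability.
    """
    total = sum(sample_sizes)
    order = sorted(range(len(sample_sizes)), key=lambda i: -sample_sizes[i])
    dists = [dict() for _ in range(m)]
    k, room = 0, total  # current distribution and its remaining capacity
    for i in order:
        tokens = m * sample_sizes[i]
        while tokens > 0:
            take = min(tokens, room)
            dists[k][i] = Fraction(take, total)
            tokens -= take
            room -= take
            if room == 0:  # distribution k is full, move to the next one
                k, room = k + 1, total
    return dists


def sample_clients(dists, rng):
    """Draw one client independently from each distribution."""
    chosen = []
    for d in dists:
        clients = list(d)
        weights = [float(d[c]) for c in clients]
        chosen.append(rng.choices(clients, weights=weights, k=1)[0])
    return chosen


if __name__ == "__main__":
    sizes = [10, 20, 30, 40]  # hypothetical client sample sizes
    dists = size_based_distributions(sizes, m=2)
    for k, d in enumerate(dists):
        print(k, {c: str(p) for c, p in d.items()})
    print(sample_clients(dists, random.Random(0)))
```

With these hypothetical sizes, the largest client appears in only one of the two distributions, so it can be selected at most once per round, whereas drawing both clients from a single distribution could select it twice. That is the intuition behind the improved representativity and lower weight variance claimed above.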
Taxonomy
Topics: Privacy-Preserving Technologies in Data
