Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning

Yann Fraboni; Richard Vidal; Laetitia Kameni; Marco Lorenzi

arXiv:2105.05883·cs.LG·May 24, 2021·47 cites

TL;DR

This paper introduces clustered sampling in federated learning to improve client representativity, reduce variance, and enhance training stability without extra client operations, outperforming standard sampling methods.

Contribution

It proposes a novel clustered sampling method for client selection in federated learning, improving convergence and stability while maintaining compatibility with existing privacy and compression techniques.

Findings

1. Clustered sampling reduces variance in client aggregation weights.
2. It improves training convergence in non-iid, unbalanced scenarios.
3. The method seamlessly integrates with standard federated learning frameworks.

Abstract

This work addresses the problem of optimizing communications between server and clients in federated learning (FL). Current sampling approaches in FL are either biased or non-optimal in terms of server-client communications and training stability. To overcome this issue, we introduce clustered sampling for client selection. We prove that clustered sampling leads to better client representativity and to reduced variance of the clients' stochastic aggregation weights in FL. Consistent with our theory, we provide two different clustering approaches enabling client aggregation based on 1) sample size, and 2) model similarity. Through a series of experiments in non-iid and unbalanced scenarios, we demonstrate that model aggregation through clustered sampling consistently leads to better training convergence and lower variability when compared to standard sampling approaches. Our…
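To make the sample-size variant more concrete, here is a minimal Python sketch of one way the idea could be realized. It is an illustrative construction under stated assumptions, not necessarily the authors' algorithm: the function names `build_cluster_units` and `sample_clients`, the largest-first ordering, and the integer "unit" bookkeeping are all hypothetical choices made for this example. Each of the m sampled clients is assumed to receive aggregation weight 1/m; under that assumption a client's expected weight in this sketch equals its share of the total data, because its units across all distributions sum to m times its sample size.

```python
import random


def build_cluster_units(sample_sizes, m):
    """Split clients into m sampling distributions, expressed in integer units.

    Client i contributes m * sample_sizes[i] units and every distribution holds
    exactly sum(sample_sizes) units, so the units fit with none left over.
    """
    total = sum(sample_sizes)
    n = len(sample_sizes)
    units = [[0] * n for _ in range(m)]
    # Visit clients from largest to smallest (an assumed ordering for this sketch).
    order = sorted(range(n), key=lambda i: sample_sizes[i], reverse=True)
    k, room = 0, total
    for i in order:
        left = m * sample_sizes[i]
        while left > 0:
            take = min(left, room)
            units[k][i] += take
            left -= take
            room -= take
            if room == 0:  # distribution k is full, move on to the next one
                k, room = k + 1, total
    return units


def sample_clients(units, rng=random):
    """Draw one client index from each distribution (m draws in total)."""
    n = len(units[0])
    return [rng.choices(range(n), weights=row)[0] for row in units]


units = build_cluster_units([10, 20, 30, 40], m=2)
print(units)  # [[0, 0, 20, 80], [20, 40, 40, 0]]
print(sample_clients(units))  # one client index per distribution
```

In this toy run the first distribution can only pick clients 2 or 3 and the second only clients 0, 1, or 2, so the two draws in a round are spread across different parts of the client population, which is the intuition behind the improved representativity described in the abstract.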

Code & Models

Repositories

Accenture/Labs-Federated-Learning (PyTorch, official)

Videos

Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning · SlidesLive

Taxonomy

Topics: Privacy-Preserving Technologies in Data