Clustered Federated Learning via Embedding Distributions
Dekai Zhang, Matthew Williams, Francesca Toni

TL;DR
This paper introduces EMD-CFL, a one-shot clustering method for federated learning that groups clients by the Earth Mover's distance between their data distributions in embedding space, improving clustering performance on non-IID data.
Contribution
The paper presents a new one-shot clustering approach for federated learning based on Earth Mover's distance, with theoretical motivation and extensive empirical validation.
Findings
Superior clustering performance over 16 baselines
Effective handling of non-IID data in federated learning
Theoretically motivated by domain adaptation results
Abstract
Federated learning (FL) is a widely used framework for machine learning in distributed data environments where clients hold data that cannot be easily centralised, such as for data protection reasons. FL, however, is known to be vulnerable to non-IID data. Clustered FL addresses this issue by finding more homogeneous clusters of clients. We propose a novel one-shot clustering method, EMD-CFL, using the Earth Mover's distance (EMD) between data distributions in embedding space. We theoretically motivate the use of EMDs using results from the domain adaptation literature and demonstrate empirically superior clustering performance in extensive comparisons against 16 baselines and on a range of challenging datasets.
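The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea (pairwise EMDs between client embedding sets, followed by a single clustering pass), not the authors' implementation. Several choices here are assumptions made for illustration: every client contributes the same number of uniformly weighted embedding vectors (so the EMD reduces to an optimal assignment problem), the ground distance is Euclidean, the clustering step is average-linkage agglomerative clustering, and the number of clusters is known in advance. The function names `emd` and `cluster_clients` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist, squareform


def emd(a, b):
    """EMD between two equal-size, uniformly weighted point sets.

    Assumption: with equal sizes and uniform weights, the optimal
    transport plan is a one-to-one matching, so an assignment solver
    gives the exact EMD.
    """
    cost = cdist(a, b)  # pairwise Euclidean ground distances
    rows, cols = linear_sum_assignment(cost)  # minimum-cost matching
    return cost[rows, cols].mean()


def cluster_clients(client_embeddings, n_clusters):
    """Cluster clients once, from the matrix of pairwise EMDs."""
    n = len(client_embeddings)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = emd(client_embeddings[i], client_embeddings[j])
            dist[i, j] = dist[j, i] = d
    # Assumption: average-linkage agglomerative clustering with a
    # known cluster count; the paper may use a different procedure.
    tree = linkage(squareform(dist), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic stand-in for client embeddings: two groups of three
# clients each, drawn from well-separated Gaussians.
rng = np.random.default_rng(0)
clients = [rng.normal(loc=0.0, size=(50, 8)) for _ in range(3)]
clients += [rng.normal(loc=5.0, size=(50, 8)) for _ in range(3)]
labels = cluster_clients(clients, n_clusters=2)
```

In this toy setup, `labels` assigns the first three clients to one cluster and the last three to another. In a real federated setting the embeddings would come from a shared encoder applied to each client's local data, which this sketch replaces with synthetic samples.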
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Graph Neural Networks · Machine Learning in Healthcare
