Efficient Distribution Similarity Identification in Clustered Federated Learning via Principal Angles Between Client Data Subspaces

Saeed Vahidian; Mahdi Morafah; Weijia Wang; Vyacheslav Kungurtsev; Chen Chen; Mubarak Shah; and Bill Lin

arXiv:2209.10526 · cs.LG · September 22, 2022 · 5 cites

TL;DR

This paper introduces a novel clustered federated learning method that efficiently identifies client data distribution similarities by analyzing principal angles between client data subspaces, enabling faster and more effective clustering.

Contribution

The paper proposes a direct, single-shot approach using principal angles and SVD to identify client data similarities, improving clustering efficiency in federated learning.
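As a rough illustration of the idea, the sketch below computes principal angles between two client data subspaces with NumPy. It is not the authors' implementation: the function names, the features-by-samples data layout, the subspace dimension `p`, and the use of the smallest angle as the pairwise proximity are all assumptions made for this example. It relies only on the standard result that the singular values of `U_a.T @ U_b` are the cosines of the principal angles when both bases are orthonormal.

```python
import numpy as np


def client_subspace(data, p):
    """Return a d x p orthonormal basis: the top-p left singular vectors.

    `data` is assumed to be a d x n matrix (features by samples);
    `p` is a hypothetical choice of subspace dimension.
    """
    u, _, _ = np.linalg.svd(data, full_matrices=False)
    return u[:, :p]


def principal_angles(u_a, u_b):
    """Principal angles (radians, ascending) between two subspaces.

    Both inputs must have orthonormal columns. The singular values of
    u_a.T @ u_b are the cosines of the principal angles.
    """
    cosines = np.linalg.svd(u_a.T @ u_b, compute_uv=False)
    # Clip to guard against round-off pushing a cosine slightly above 1.
    return np.arccos(np.clip(cosines, -1.0, 1.0))


def proximity_matrix(bases):
    """Symmetric matrix of smallest principal angles between all pairs.

    Using the smallest angle as the proximity is an assumption of this
    sketch; other summaries (for example the sum of angles) are possible.
    """
    k = len(bases)
    prox = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            angle = principal_angles(bases[i], bases[j])[0]
            prox[i, j] = prox[j, i] = angle
    return prox
```

In this setup each client would share only its small basis matrix once, and the server would build the proximity matrix from those bases before training starts. Clients whose data span nearly the same subspace get angles close to zero, while clients with unrelated data get angles close to π/2.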

Findings

01. Enables rapid clustering based on data distribution similarities.

02. Provides convergence guarantees for non-convex federated learning objectives.

03. Addresses broad data heterogeneity issues beyond label skew.

Abstract

Clustered federated learning (FL) has been shown to produce promising results by grouping clients into clusters. This is especially effective in scenarios where separate groups of clients have significant differences in the distributions of their local data. Existing clustered FL algorithms are essentially trying to group together clients with similar distributions so that clients in the same cluster can leverage each other's data to better perform federated learning. However, prior clustered FL algorithms attempt to learn these distribution similarities indirectly during training, which can be quite time-consuming as many rounds of federated learning may be required until the formation of clusters is stabilized. In this paper, we propose a new approach to federated learning that directly aims to efficiently identify distribution similarities among clients by analyzing the principal…

Code & Models

Repositories

mmorafah/pacfl (PyTorch, official)

Videos

Efficient Distribution Similarity Identification in Clustered Federated Learning via Principal Angles Between Client Data Subspaces · underline

Taxonomy

Topics: Privacy-Preserving Technologies in Data