D-Cliques: Compensating for Data Heterogeneity with Topology in Decentralized Federated Learning

Aurélien Bellet; Anne-Marie Kermarrec; Erick Lavoie

arXiv:2104.07365·cs.LG·November 5, 2021

Open Access

TL;DR

This paper introduces D-Cliques, a topology design for decentralized federated learning that reduces the impact of label distribution skew by grouping nodes into sparsely interconnected cliques whose label distribution is representative of the global one, improving convergence speed and communication efficiency.

Contribution

The paper proposes D-Cliques, a novel topology that mitigates label distribution skew in decentralized federated learning, together with adapted decentralized SGD updates that yield unbiased gradients and support an effective momentum, and an empirical validation on MNIST and CIFAR10.

Findings

01

D-Cliques achieve a convergence speed similar to that of a fully connected topology.

02

Communication overhead drops significantly: D-Cliques use 98% fewer edges than a fully connected topology.

03

The approach is effective under heterogeneous data partitions on MNIST and CIFAR10.
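
To give a feel for the scale of the edge reduction, the short sketch below compares edge counts for a fully connected network and a clique-based one. The network size (1000 nodes), the clique size (10), and the choice of one edge between every pair of cliques are illustrative assumptions, not values stated on this page; the function names are hypothetical.

```python
def fully_connected_edges(n: int) -> int:
    """Number of undirected edges in a complete graph on n vertices."""
    return n * (n - 1) // 2


def clique_topology_edges(n_nodes: int, clique_size: int) -> int:
    """Edges when nodes form equal-size cliques and every pair of cliques
    is joined by exactly one edge (an assumed inter-clique layout)."""
    if n_nodes % clique_size != 0:
        raise ValueError("n_nodes must be a multiple of clique_size")
    n_cliques = n_nodes // clique_size
    intra = n_cliques * fully_connected_edges(clique_size)  # edges inside cliques
    inter = fully_connected_edges(n_cliques)                # one edge per clique pair
    return intra + inter


# Assumed example sizes, chosen for illustration only.
n_nodes, clique_size = 1000, 10
full = fully_connected_edges(n_nodes)
sparse = clique_topology_edges(n_nodes, clique_size)
reduction = 1 - sparse / full
print(f"fully connected: {full} edges, clique-based: {sparse} edges")
print(f"reduction: {reduction:.1%}")
```

Under these assumptions the clique-based layout keeps roughly 2% of the edges of the complete graph; sparser inter-clique layouts would reduce the count further.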

Abstract

The convergence speed of machine learning models trained with Federated Learning is significantly affected by heterogeneous data partitions, even more so in a fully decentralized setting without a central server. In this paper, we show that the impact of label distribution skew, an important type of data heterogeneity, can be significantly reduced by carefully designing the underlying communication topology. We present D-Cliques, a novel topology that reduces gradient bias by grouping nodes in sparsely interconnected cliques such that the label distribution in a clique is representative of the global label distribution. We also show how to adapt the updates of decentralized SGD to obtain unbiased gradients and implement an effective momentum with D-Cliques. Our extensive empirical evaluation on MNIST and CIFAR10 demonstrates that our approach provides similar convergence speed as a…
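
The idea of a clique whose label distribution is representative of the global one can be illustrated with a minimal sketch. It assumes an extreme skew where every node holds examples of a single class and every class has the same number of nodes, then forms cliques by taking one node per class. This is a simplified illustration and not the construction algorithm from the paper; `build_cliques` and `skew` are hypothetical names.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def build_cliques(node_labels: Dict[int, int]) -> List[List[int]]:
    """Group single-label nodes so that each clique holds one node per label.

    Nodes left over when labels are unevenly represented are not assigned.
    """
    by_label = defaultdict(list)
    for node, label in sorted(node_labels.items()):
        by_label[label].append(node)
    n_cliques = min(len(nodes) for nodes in by_label.values())
    return [
        [by_label[label][i] for label in sorted(by_label)]
        for i in range(n_cliques)
    ]


def skew(clique: List[int], node_labels: Dict[int, int]) -> float:
    """L1 distance between a clique's label distribution and the global one."""
    global_counts = Counter(node_labels.values())
    clique_counts = Counter(node_labels[n] for n in clique)
    total, size = len(node_labels), len(clique)
    return sum(
        abs(clique_counts[label] / size - global_counts[label] / total)
        for label in global_counts
    )


# Assumed toy setup: 100 nodes, 10 classes, each node holds a single class.
node_labels = {node: node % 10 for node in range(100)}
cliques = build_cliques(node_labels)
print(len(cliques), "cliques of size", len(cliques[0]))
print("max skew:", max(skew(c, node_labels) for c in cliques))
```

In this toy setup every clique covers all ten classes once, so its label distribution matches the global one and averaging gradients within a clique is no longer biased toward a single class.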

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks

Methods: Stochastic Gradient Descent