Refined Convergence and Topology Learning for Decentralized SGD with Heterogeneous Data
Batiste Le Bars, Aurélien Bellet, Marc Tommasi, Erick Lavoie, Anne-Marie Kermarrec

TL;DR
This paper presents a new analysis of decentralized SGD (D-SGD) under data heterogeneity and proposes a data-dependent topology learning method that improves convergence and communication efficiency in decentralized and federated learning.
Contribution
The paper identifies neighborhood heterogeneity as a key factor in the convergence rate of D-SGD and develops a topology learning approach that mitigates the effect of data heterogeneity.
Findings
Neighborhood heterogeneity significantly impacts D-SGD convergence.
Learning data-dependent topologies improves convergence speed.
Sparse topologies balance communication costs and convergence efficiency.
Abstract
One of the key challenges in decentralized and federated learning is to design algorithms that efficiently deal with highly heterogeneous data distributions across agents. In this paper, we revisit the analysis of the popular Decentralized Stochastic Gradient Descent algorithm (D-SGD) under data heterogeneity. We exhibit the key role played by a new quantity, called neighborhood heterogeneity, on the convergence rate of D-SGD. By coupling the communication topology and the heterogeneity, our analysis sheds light on the poorly understood interplay between these two concepts. We then argue that neighborhood heterogeneity provides a natural criterion to learn data-dependent topologies that reduce (and can even eliminate) the otherwise detrimental effect of data heterogeneity on the convergence time of D-SGD. For the important case of classification with label skew, we formulate the problem…
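To make the interplay between topology and heterogeneity concrete, here is a minimal toy sketch. It rests on several assumptions that do not appear in the abstract: each agent j holds a scalar quadratic objective f_j(θ) = 0.5 (θ − c_j)², gradients are exact rather than stochastic, agents take a local step and then average with their neighbors (one common D-SGD variant), and neighborhood heterogeneity is taken to be the average over agents of the squared gap between the W-weighted neighborhood gradient and the global gradient. The function names and the two ring topologies are hypothetical and are not the paper's learned topologies.

```python
def ring_mixing(order):
    """Mixing matrix W for a ring that visits agents in `order`.

    Each agent keeps weight 1/2 on itself and 1/4 on each ring neighbor,
    which makes W symmetric and doubly stochastic (illustrative choice).
    """
    n = len(order)
    W = [[0.0] * n for _ in range(n)]
    for pos, i in enumerate(order):
        W[i][i] = 0.5
        W[i][order[(pos - 1) % n]] += 0.25
        W[i][order[(pos + 1) % n]] += 0.25
    return W


def neighborhood_heterogeneity(W, centers):
    """Assumed definition: (1/n) * sum_i (sum_j W_ij grad f_j - grad f)^2.

    For f_j(theta) = 0.5 * (theta - c_j)^2 the gradient is theta - c_j, so the
    gap reduces to mean(c) - sum_j W_ij c_j and does not depend on theta.
    """
    n = len(centers)
    mean_c = sum(centers) / n
    total = 0.0
    for i in range(n):
        mixed = sum(W[i][j] * centers[j] for j in range(n))
        total += (mixed - mean_c) ** 2
    return total / n


def dsgd(W, centers, lr=0.1, steps=200):
    """Noise-free D-SGD sketch: local gradient step, then neighbor averaging."""
    n = len(centers)
    theta = [0.0] * n
    for _ in range(steps):
        # Each agent steps on its own objective (exact gradient: a simplification).
        local = [theta[j] - lr * (theta[j] - centers[j]) for j in range(n)]
        # Each agent averages the updated parameters of its neighbors using W.
        theta = [sum(W[i][j] * local[j] for j in range(n)) for i in range(n)]
    return theta


# Two "classes" of agents, loosely mimicking label skew: optima at 0 and at 1.
centers = [0.0, 0.0, 1.0, 1.0]
optimum = sum(centers) / len(centers)  # minimizer of the average objective

grouped = ring_mixing([0, 1, 2, 3])     # similar agents sit next to each other
alternated = ring_mixing([0, 2, 1, 3])  # every agent's neighbors are of the other class

h_grouped = neighborhood_heterogeneity(grouped, centers)
h_alternated = neighborhood_heterogeneity(alternated, centers)

err_grouped = max(abs(t - optimum) for t in dsgd(grouped, centers))
err_alternated = max(abs(t - optimum) for t in dsgd(alternated, centers))

print("neighborhood heterogeneity:", h_grouped, h_alternated)
print("max distance to optimum:", err_grouped, err_alternated)
```

In this toy setting both rings have the same number of edges, yet the alternated ring gives every agent a neighborhood whose weighted average matches the global average, so the assumed heterogeneity term vanishes and the constant-step iterates settle at the global optimum, whereas the grouped ring leaves a persistent bias. This illustrates the idea only; it is not the paper's analysis or algorithm.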
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Sparse and Compressive Sensing Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
