On the Benefits of Multiple Gossip Steps in Communication-Constrained Decentralized Optimization

Abolfazl Hashemi; Anish Acharya; Rudrajit Das; Haris Vikalo; Sujay Sanghavi; Inderjit Dhillon

arXiv:2011.10643 · cs.LG · November 24, 2020 · 5 cites

TL;DR

This paper demonstrates that multiple gossip steps in compressed decentralized optimization improve convergence, enabling efficient training of large-scale machine learning models with lossy communication.

Contribution

It provides the first convergence analysis for nonconvex optimization with arbitrary communication compression in decentralized settings.

Findings

01

Multiple gossip steps accelerate convergence in compressed decentralized optimization.

02

Convergence to within ε of the optimum is achieved with O(log(1/ε)) gradient iterations at a constant step size and O(log(1/ε)) gossip steps between each pair of consecutive iterations.

03

Results apply to both non-convex and strongly convex objectives.

Abstract

In decentralized optimization, it is common algorithmic practice to have nodes interleave (local) gradient descent iterations with gossip (i.e., averaging over the network) steps. Motivated by the training of large-scale machine learning models, it is also increasingly common to require that messages be lossy compressed versions of the local parameters. In this paper, we show that, in such compressed decentralized optimization settings, there are benefits to having multiple gossip steps between subsequent gradient iterations, even when the cost of doing so is appropriately accounted for, e.g., by means of reducing the precision of compressed information. In particular, we show that having $O(\log \frac{1}{\epsilon})$ gradient iterations with constant step size, and $O(\log \frac{1}{\epsilon})$ gossip steps between every pair of these iterations, enables convergence to within…
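The interleaving described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the ring topology, the top-k compressor, the quadratic local objectives f_i(x) = 0.5 * ||x - b_i||^2, the mixing weight `gamma`, and every function name below are assumptions chosen for illustration. Nodes exchange only compressed differences against publicly known estimates `X_hat`, and `gossip_steps` controls how many such rounds run between consecutive gradient iterations.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Symmetric, doubly stochastic mixing matrix for a ring of n >= 3 nodes."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0  # equal weight on self and both neighbours
    return W


def top_k(v, k):
    """Lossy compressor (assumed): keep the k largest-magnitude entries, 1 <= k <= len(v)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out


def compressed_gossip(X, X_hat, W, num_steps, k, gamma):
    """Run num_steps gossip rounds in which only compressed messages are exchanged.

    X     : (n, d) local parameters, one row per node.
    X_hat : (n, d) estimates of each node's parameters known to its neighbours.
    """
    X, X_hat = X.copy(), X_hat.copy()
    n = X.shape[0]
    for _ in range(num_steps):
        # Each node transmits a compressed difference to its public estimate.
        Q = np.stack([top_k(X[i] - X_hat[i], k) for i in range(n)])
        X_hat = X_hat + Q
        # Averaging step on the public estimates; it leaves the network mean unchanged.
        X = X + gamma * (W @ X_hat - X_hat)
    return X, X_hat


def consensus_error(X):
    """Frobenius distance between the local parameters and their network average."""
    return np.linalg.norm(X - X.mean(axis=0))


def decentralized_gd(B, W, num_iters, gossip_steps, k, gamma, step_size):
    """Interleave local gradient steps on f_i(x) = 0.5 * ||x - b_i||^2 with gossip."""
    n, d = B.shape
    X = np.zeros((n, d))
    X_hat = np.zeros((n, d))
    for _ in range(num_iters):
        X = X - step_size * (X - B)  # local gradient step with constant step size
        X, X_hat = compressed_gossip(X, X_hat, W, gossip_steps, k, gamma)
    return X


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 8, 5
    W = ring_mixing_matrix(n)
    B = rng.normal(size=(n, d))
    for q in (1, 5):
        X = decentralized_gd(B, W, num_iters=60, gossip_steps=q, k=2, gamma=0.3, step_size=0.5)
        print(f"gossip_steps={q}: consensus error = {consensus_error(X):.4f}")
```

Setting `k` equal to the dimension and `gamma=1.0` reduces each round to plain uncompressed gossip, `X = W @ X`, which gives a baseline against which a compressed run can be compared. Because `W` is doubly stochastic, the network average of the parameters is unaffected by the gossip rounds regardless of the compressor.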

Code & Models

Repositories

anishacharya/DeLiCoCo

Taxonomy

Topics: Distributed Control Multi-Agent Systems · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques