Decentralized Learning with Lazy and Approximate Dual Gradients

Yanli Liu; Yuejiao Sun; Wotao Yin

arXiv:2008.01748 · math.OC · August 6, 2020 · IEEE Trans. Signal Process. · 1 cites

Open Access

TL;DR

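This paper introduces decentralized learning algorithms that reduce both communication and computation costs. Computation is saved by approximating dual gradients with a few (stochastic) gradient steps instead of solving a subproblem, and communication is saved by an easy-to-compute local rule that applies when previous information is still sufficiently useful.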

Contribution

The paper proposes simple algorithms that build on SSDA and MSDA. They replace the subproblem solved at each step with a small, fixed number of Katyusha steps (approximate dual gradients) and reduce communication through lazy updates.

Findings

01 Significant reduction in communication costs.

02 Notable decrease in computational complexity.

03 The algorithms outperform state-of-the-art methods in experiments.

Abstract

This paper develops algorithms for decentralized machine learning over a network, where data are distributed, computation is localized, and communication is restricted to neighbors. A line of recent research in this area focuses on improving both computation and communication complexities. The methods SSDA and MSDA (Scaman et al., 2017) have optimal communication complexity when the objective is smooth and strongly convex, and are simple to derive. However, they require solving a subproblem at each step. We propose new algorithms that save computation by using (stochastic) gradients and save communication when previous information is sufficiently useful. Our methods remain relatively simple: rather than solving a subproblem, they run Katyusha for a small, fixed number of steps from the latest point. An easy-to-compute, local rule is used to decide if a worker can skip a…
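The idea of skipping communication when previous information is still useful can be illustrated with a toy sketch. This is not the paper's algorithm: it uses plain primal gradient steps on scalar quadratics rather than dual gradients or Katyusha, and the skip rule (a fixed threshold on how far a worker has moved since its last broadcast) is an assumption made for illustration. The function name `lazy_decentralized_gd`, the ring topology, and all parameter values are hypothetical.

```python
def lazy_decentralized_gd(targets, step=0.1, threshold=1e-3, iters=300):
    """Toy decentralized gradient descent on a ring with lazy broadcasts.

    Worker i holds f_i(x) = 0.5 * (x - targets[i])**2 (assumed objective).
    Assumed skip rule: a worker re-broadcasts only if its iterate moved by
    at least `threshold` since its last broadcast. Requires at least 3 workers.
    """
    n = len(targets)
    x = [0.0] * n          # local iterates
    last_sent = [0.0] * n  # possibly stale copies held by neighbors
    messages = 0

    for _ in range(iters):
        # Local, easy-to-compute rule: broadcast only after a large enough change.
        for i in range(n):
            if abs(x[i] - last_sent[i]) >= threshold:
                last_sent[i] = x[i]
                messages += 1

        # Mix own value with the neighbors' last broadcasts, then take a gradient step.
        new_x = []
        for i in range(n):
            left = last_sent[(i - 1) % n]
            right = last_sent[(i + 1) % n]
            mixed = (x[i] + left + right) / 3.0
            grad = x[i] - targets[i]
            new_x.append(mixed - step * grad)
        x = new_x

    return x, messages


targets = [1.0, 2.0, 3.0, 4.0, 5.0]
# threshold=0.0 makes the rule always fire, so every worker broadcasts every round.
x_eager, msgs_eager = lazy_decentralized_gd(targets, threshold=0.0)
x_lazy, msgs_lazy = lazy_decentralized_gd(targets, threshold=1e-3)
print(msgs_eager, msgs_lazy)
print(sum(x_eager) / len(x_eager), sum(x_lazy) / len(x_lazy))
```

In this toy setting the lazy run sends fewer messages, while the network average stays near the minimizer of the summed objective (the mean of `targets`), up to an error that grows with the threshold. The individual iterates still carry the usual constant-step consensus bias, so only the average is compared here.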

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Sparse and Compressive Sensing Techniques