Communication-Efficient Stochastic Distributed Learning
Xiaoxing Ren, Nicola Bastianello, Karl H. Johansson, Thomas Parisini

TL;DR
This paper introduces a communication-efficient distributed learning algorithm using stochastic gradients and local training, applicable to convex and nonconvex problems, with proven convergence and variance reduction enhancements.
Contribution
It proposes a novel ADMM-based algorithm that uses local training and stochastic gradients to reduce communication costs and accelerate convergence, and it includes a variance-reduced variant.
Findings
Converges to a neighborhood of a stationary point for nonconvex problems, and of an optimal point for convex problems.
The variance-reduced variant achieves exact convergence.
Local training accelerates convergence.
Abstract
We address distributed learning problems, both nonconvex and convex, over undirected networks. In particular, we design a novel algorithm based on the distributed Alternating Direction Method of Multipliers (ADMM) to address the challenges of high communication costs and large datasets. Our design tackles these challenges i) by enabling the agents to perform multiple local training steps between consecutive rounds of communication; and ii) by allowing the agents to employ stochastic gradients while carrying out local computations. We show that the proposed algorithm converges to a neighborhood of a stationary point for nonconvex problems, and of an optimal point for convex problems. We also propose a variant of the algorithm that incorporates variance reduction, thus achieving exact convergence. We show that the resulting algorithm indeed converges to a stationary (or optimal) point, and…
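The two ingredients named in the abstract, several local training steps between communication rounds and mini-batch stochastic gradients, can be illustrated with a minimal sketch. This is not the paper's ADMM-based update: the communication step is replaced by plain averaging over all agents, and the least-squares data, the function names (`make_local_data`, `stochastic_grad`, `run`), and every parameter value are assumptions chosen only for illustration.

```python
import numpy as np


def make_local_data(rng, n_agents=4, m=50, dim=3):
    """Hypothetical setup: each agent holds a small least-squares dataset."""
    x_true = rng.normal(size=dim)
    data = []
    for _ in range(n_agents):
        A = rng.normal(size=(m, dim))
        b = A @ x_true + 0.1 * rng.normal(size=m)
        data.append((A, b))
    return data, x_true


def stochastic_grad(A, b, x, rng, batch=5):
    """Mini-batch gradient of the local cost 0.5 * mean((A x - b)**2)."""
    idx = rng.choice(len(b), size=batch, replace=False)
    return A[idx].T @ (A[idx] @ x - b[idx]) / batch


def run(data, dim, rounds=200, local_steps=5, step=0.05, seed=0):
    """Alternate local stochastic training with one averaging round."""
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    for _ in range(rounds):
        local_models = []
        for A, b in data:
            w = x.copy()
            # Local training: several stochastic steps, no communication.
            for _ in range(local_steps):
                w -= step * stochastic_grad(A, b, w, rng)
            local_models.append(w)
        # One communication round. ASSUMPTION: plain averaging over all
        # agents stands in for the paper's ADMM-based exchange.
        x = np.mean(local_models, axis=0)
    return x
```

Calling `run(data, dim=3)` on the output of `make_local_data(np.random.default_rng(1))` should return a point close to, but generally not equal to, the minimizer of the summed local costs. With a constant step size the stochastic-gradient noise does not vanish, which is consistent with the neighborhood convergence described in the abstract; variance reduction is what the abstract credits with removing that residual error.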
