Distributed learning with compressed gradients

Sarit Khirirat; Hamid Reza Feyzmahdavian; Mikael Johansson

arXiv:1806.06573 · math.OC · November 30, 2018 · 39 citations

Open Access

TL;DR

This paper develops a unified theoretical framework for analyzing distributed gradient methods that use asynchronous updates and gradient compression, providing explicit convergence bounds and demonstrating fast convergence with limited communication.

Contribution

The paper introduces an analysis framework for distributed gradient methods that operate on stale and compressed gradients. It derives explicit convergence bounds and characterizes how asynchrony and compression affect them.

Findings

01. Explicit convergence bounds for compressed gradient algorithms

02. Trade-offs between asynchrony, compression accuracy, and convergence speed

03. Numerical validation of fast convergence with limited communication

Abstract

Asynchronous computation and gradient compression have emerged as two key techniques for achieving scalability in distributed optimization for large-scale machine learning. This paper presents a unified analysis framework for distributed gradient methods operating with stale and compressed gradients. Non-asymptotic bounds on convergence rates and information exchange are derived for several optimization algorithms. These bounds give explicit expressions for step-sizes and characterize how the amount of asynchrony and the compression accuracy affect iteration and communication complexity guarantees. Numerical results highlight convergence properties of different gradient compression algorithms and confirm that fast convergence under limited information exchange is indeed possible.
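To make the setting concrete, here is a minimal single-process sketch of gradient descent driven by gradients that are both delayed and compressed. It is an illustration only, not the authors' algorithm: the top-k sparsifier, the fixed delay, the least-squares objective, and the names `top_k` and `compressed_delayed_gd` are all assumptions chosen for the example, and the step size is picked by hand rather than from the paper's bounds.

```python
import numpy as np


def top_k(g, k):
    """Keep the k largest-magnitude entries of g and zero out the rest."""
    out = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]  # indices of the k largest |g_i|
    out[idx] = g[idx]
    return out


def compressed_delayed_gd(A, b, k, delay, step, iters):
    """Minimize 0.5 * ||A x - b||^2 with stale, top-k compressed gradients.

    Hypothetical helper for illustration: the gradient at iteration t is
    evaluated at the iterate from `delay` steps earlier (clamped to the
    starting point), then sparsified before the update is applied.
    """
    n = A.shape[1]
    history = [np.zeros(n)]  # history[t] is the iterate x_t
    for t in range(iters):
        stale_x = history[max(0, t - delay)]      # outdated iterate
        grad = A.T @ (A @ stale_x - b)            # gradient at the stale point
        x_next = history[-1] - step * top_k(grad, k)
        history.append(x_next)
    return history[-1]


if __name__ == "__main__":
    A = np.eye(4)
    b = np.array([1.0, 2.0, 3.0, 4.0])
    # Only one gradient coordinate is "transmitted" per iteration.
    x = compressed_delayed_gd(A, b, k=1, delay=0, step=1.0, iters=10)
    print(x)
```

In this toy problem a larger delay forces a smaller step size to keep the iteration stable, and a smaller `k` reduces the number of coordinates sent per iteration at the cost of more iterations. This is the kind of trade-off between asynchrony, compression accuracy, and convergence speed that the abstract describes.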

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Distributed Control Multi-Agent Systems