Distributed Methods with Absolute Compression and Error Compensation
Marina Danilova, Eduard Gorbunov

TL;DR
This paper analyzes error-compensated distributed optimization methods with absolute compression. It extends the convergence guarantees to arbitrary sampling and to strongly convex problems, and reports improved convergence rates.
Contribution
The paper generalizes the analysis of Error Compensated SGD (EC-SGD) with absolute compression to arbitrary sampling and gives the first analysis of EC-LSVRG with absolute compression for convex problems.
Findings
Improved convergence rates for EC-SGD with absolute compression under arbitrary sampling.
First theoretical analysis of EC-LSVRG with absolute compression for convex problems.
Numerical experiments confirm the theoretical improvements.
Abstract
Distributed optimization methods are often applied to solving huge-scale problems like training neural networks with millions and even billions of parameters. In such applications, communicating full vectors, e.g., (stochastic) gradients, iterates, is prohibitively expensive, especially when the number of workers is large. Communication compression is a powerful approach to alleviating this issue, and, in particular, methods with biased compression and error compensation are extremely popular due to their practical efficiency. Sahu et al. (2021) propose a new analysis of Error Compensated SGD (EC-SGD) for the class of absolute compression operators showing that in a certain sense, this class contains optimal compressors for EC-SGD. However, the analysis was conducted only under the so-called (M, σ²)-bounded noise assumption. In this paper, we generalize the analysis of EC-SGD…
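Illustration
The error-compensation loop described in the abstract can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses a hard-threshold sparsifier as one example of an absolute compressor (its squared error is at most d·lam² for a d-dimensional input, regardless of the input's norm), and the names hard_threshold, ec_sgd, grad_fns, and lam are hypothetical, chosen only for this example. Each worker compresses its step plus its accumulated error, keeps what was not sent, and the server averages the compressed messages.

```python
import numpy as np

def hard_threshold(v, lam):
    """Keep entries with |v_i| >= lam and zero the rest.

    Every dropped entry is smaller than lam in magnitude, so
    ||C(v) - v||^2 <= d * lam^2 for any input v (an absolute bound).
    """
    return np.where(np.abs(v) >= lam, v, 0.0)

def ec_sgd(grad_fns, x0, lr, lam, steps):
    """Error-compensated SGD sketch with one gradient oracle per worker.

    grad_fns: list of callables g_i(x) returning a (stochastic) gradient.
    Returns the final iterate and the per-worker error buffers.
    """
    x = np.asarray(x0, dtype=float).copy()
    errors = [np.zeros_like(x) for _ in grad_fns]
    for _ in range(steps):
        messages = []
        for i, g in enumerate(grad_fns):
            p = errors[i] + lr * g(x)      # step plus accumulated error
            m = hard_threshold(p, lam)     # what the worker actually sends
            errors[i] = p - m              # remember what was not sent
            messages.append(m)
        x = x - np.mean(messages, axis=0)  # server averages the messages
    return x, errors
```

In this sketch the error buffers never exceed lam in any coordinate, which is the property that distinguishes absolute compressors from relative ones such as Top-k, whose error scales with the norm of the input.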
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research
Methods: Stochastic Gradient Descent
