Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization
Zhize Li, Dmitry Kovalev, Xun Qian, Peter Richtárik

TL;DR
This paper introduces the first accelerated compressed gradient descent methods for distributed and federated learning, achieving faster convergence rates by combining acceleration with gradient compression.
Contribution
It proposes the first accelerated compressed gradient descent (ACGD) algorithms and their distributed variant ADIANA, with improved theoretical convergence rates over existing methods.
Findings
ACGD converges faster than non-accelerated compressed gradient descent on both strongly convex and convex problems.
ADIANA outperforms previous distributed methods like DIANA in convergence speed.
Experimental results confirm the practical efficiency of the proposed accelerated methods.
Abstract
Due to the high communication cost in distributed and federated learning problems, methods relying on compression of communicated messages are becoming increasingly popular. While in other contexts the best performing gradient-type methods invariably rely on some form of acceleration/momentum to reduce the number of iterations, there are no methods which combine the benefits of both gradient compression and acceleration. In this paper, we remedy this situation and propose the first accelerated compressed gradient descent (ACGD) methods. In the single machine regime, we prove that ACGD enjoys the rate $O\big((1+\omega)\sqrt{L/\mu}\,\log(1/\epsilon)\big)$ for $\mu$-strongly convex problems and $O\big((1+\omega)\sqrt{L/\epsilon}\big)$ for convex problems, respectively, where $\omega$ is the compression parameter. Our results improve upon the existing non-accelerated…
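To make the notion of a compression parameter concrete, here is a minimal sketch of one common unbiased compressor, random-k sparsification, plugged into plain (non-accelerated) compressed gradient descent. It is an illustration only, not the paper's ACGD or ADIANA algorithm. The function names `rand_k` and `compressed_gd`, the step size, and the test objective are assumptions made for this example. For random-k on a d-dimensional vector, the compressed vector C(x) satisfies E[C(x)] = x and E||C(x) - x||^2 = (d/k - 1)||x||^2, so the compression parameter, written ω here, equals d/k - 1.

```python
import numpy as np


def rand_k(x, k, rng):
    """Random-k sparsification (illustrative helper, not from the paper).

    Keeps k coordinates chosen uniformly at random and rescales them by d/k,
    which makes the compressor unbiased: E[rand_k(x)] = x.
    """
    x = np.asarray(x, dtype=float)
    d = x.size
    out = np.zeros_like(x)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = x[idx] * (d / k)
    return out


def compressed_gd(grad, x0, step, k, iters, seed=0):
    """Non-accelerated compressed gradient descent baseline (a sketch).

    Each iteration replaces the exact gradient with its rand_k compression.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        x -= step * rand_k(grad(x), k, rng)
    return x


if __name__ == "__main__":
    d, k = 10, 5
    omega = d / k - 1            # compression parameter of rand_k
    L = 1.0                      # smoothness of f(x) = 0.5 * ||x||^2
    step = 1.0 / ((1 + omega) * L)  # assumed step size for this sketch
    x_final = compressed_gd(lambda z: z, np.ones(d), step, k, iters=200)
    print(np.linalg.norm(x_final))
```

Setting k = d gives ω = 0 and recovers ordinary gradient descent, which matches the uncompressed special case. Smaller k sends fewer coordinates per step at the cost of a larger ω and a smaller step size.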
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data
