On Distributed Adaptive Optimization with Gradient Compression

Xiaoyun Li; Belhal Karimi; Ping Li

arXiv:2205.05632·stat.ML·May 12, 2022·1 cites

Open Access

TL;DR

This paper introduces COMP-AMS, a distributed adaptive optimization framework that uses gradient compression with error feedback, achieving communication efficiency without sacrificing convergence speed or accuracy.

Contribution

It presents a simple, effective distributed adaptive optimization method with gradient compression, maintaining convergence rates and test accuracy while reducing communication costs.

Findings

1. COMP-AMS achieves the same convergence rate as standard AMSGrad.
2. The method exhibits linear speedup with the number of workers.
3. Numerical experiments confirm reduced communication with maintained accuracy.

Abstract

We study COMP-AMS, a distributed optimization framework based on gradient averaging and the adaptive AMSGrad algorithm. Gradient compression with error feedback is applied to reduce the communication cost of gradient transmission. Our convergence analysis of COMP-AMS shows that this compressed gradient averaging strategy yields the same convergence rate as standard AMSGrad and also exhibits a linear speedup effect with respect to the number of local workers. Compared with recently proposed protocols for distributed adaptive methods, COMP-AMS is simple and convenient. Numerical experiments are conducted to justify the theoretical findings and demonstrate that the proposed method can achieve the same test accuracy as full-gradient AMSGrad with substantial communication savings. With its simplicity and efficiency, COMP-AMS can serve as a useful distributed training framework for adaptive…
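
The pipeline the abstract describes (workers compress their gradients with error feedback, the server averages the compressed gradients and applies an AMSGrad update) can be illustrated with a minimal single-process sketch. This is not the authors' implementation. The top-k sparsifier, the hyperparameter defaults, the omission of bias correction, and the names `top_k`, `comp_ams_sketch`, and `grad_fn` are all assumptions made for illustration; the abstract does not specify which compressor is used.

```python
import numpy as np


def top_k(x, k):
    """Assumed compressor: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out


def comp_ams_sketch(grad_fn, theta0, n_workers=4, steps=200, k=2,
                    lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Simulate compressed gradient averaging with a server-side AMSGrad step.

    grad_fn(theta, rng) is a hypothetical callback that returns one
    stochastic gradient; all workers are simulated in a single process.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    errors = np.zeros((n_workers, theta.size))  # per-worker error accumulators
    m = np.zeros_like(theta)       # first-moment estimate
    v = np.zeros_like(theta)       # second-moment estimate
    v_hat = np.zeros_like(theta)   # running maximum of v (the AMSGrad part)

    for _ in range(steps):
        compressed = []
        for i in range(n_workers):
            g = grad_fn(theta, rng)
            corrected = g + errors[i]      # add back what was dropped earlier
            c = top_k(corrected, k)        # only c would be transmitted
            errors[i] = corrected - c      # error feedback: remember the rest
            compressed.append(c)

        g_bar = np.mean(compressed, axis=0)  # server averages compressed gradients

        # AMSGrad update on the averaged compressed gradient
        m = beta1 * m + (1.0 - beta1) * g_bar
        v = beta2 * v + (1.0 - beta2) * g_bar ** 2
        v_hat = np.maximum(v_hat, v)
        theta -= lr * m / (np.sqrt(v_hat) + eps)

    return theta
```

In this sketch each worker sends only `k` of the gradient's coordinates per step, which is where the communication saving would come from under the top-k assumption. The error accumulator ensures that coordinates dropped in one round are added back and eventually transmitted in a later one.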

Taxonomy

Topics: Distributed Control Multi-Agent Systems · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques

Methods: AMSGrad