CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression

Zhize Li; Peter Richtárik

arXiv:2107.09461·cs.LG·November 9, 2021·6 cites

TL;DR

CANITA is a novel distributed convex optimization method that combines communication compression with acceleration, achieving faster convergence rates and reducing communication rounds in federated learning scenarios.

Contribution

The paper introduces CANITA, a compressed and accelerated gradient method based on ANITA for distributed convex optimization. It attains the first accelerated convergence rate in this setting, improving on non-accelerated compressed methods such as DIANA.

Findings

01. Achieves the first accelerated rate for compressed distributed optimization.

02. Outperforms state-of-the-art non-accelerated methods in convergence speed.

03. Reduces communication rounds significantly in large-scale federated learning.

Abstract

Due to the high communication cost in distributed and federated learning, methods relying on compressed communication are becoming increasingly popular. Besides, the best theoretically and practically performing gradient-type methods invariably rely on some form of acceleration/momentum to reduce the number of communications (faster convergence), e.g., Nesterov's accelerated gradient descent (Nesterov, 1983, 2004) and Adam (Kingma and Ba, 2014). In order to combine the benefits of communication compression and convergence acceleration, we propose a compressed and accelerated gradient method based on ANITA (Li, 2021) for distributed optimization, which we call CANITA. Our CANITA achieves the first accelerated rate $O\Big(\sqrt{\Big(1+\sqrt{\frac{\omega^{3}}{n}}\Big)\frac{L}{\epsilon}} + \omega\big(\frac{1}{\epsilon}\big)^{\frac{1}{3}}\Big)$, which improves upon the…
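To make the role of compression concrete, here is a minimal sketch of one commonly used unbiased compressor, random-k sparsification. It is an illustration only and is not taken from this page or presented as the compressor used in the paper. It assumes the usual convention from the compression literature that ω bounds the relative variance of the compressor, E‖C(x) − x‖² ≤ ω‖x‖²; under that assumption random-k on a d-dimensional vector has ω = d/k − 1. The function names `rand_k`, `omega_rand_k`, and `empirical_mean` are hypothetical.

```python
import random


def rand_k(x, k, rng):
    """Random-k sparsification (illustrative, assumed compressor).

    Keeps k coordinates of x chosen uniformly at random, scales them by d/k,
    and zeroes the rest, so that the output equals x in expectation.
    """
    d = len(x)
    kept = set(rng.sample(range(d), k))
    scale = d / k
    return [scale * v if i in kept else 0.0 for i, v in enumerate(x)]


def omega_rand_k(d, k):
    """Variance parameter of rand-k under the assumed definition
    E||C(x) - x||^2 <= omega * ||x||^2."""
    return d / k - 1.0


def empirical_mean(x, k, trials, rng):
    """Average many compressed copies of x to check unbiasedness numerically."""
    acc = [0.0] * len(x)
    for _ in range(trials):
        c = rand_k(x, k, rng)
        acc = [a + ci for a, ci in zip(acc, c)]
    return [a / trials for a in acc]


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, 2.0, 3.0, 4.0]
    print("one compressed copy:", rand_k(x, 2, rng))
    print("mean of many copies:", empirical_mean(x, 2, 20000, rng))
    print("omega for d=4, k=2:", omega_rand_k(4, 2))
```

Each worker would send only the k kept coordinates instead of all d, which is where the communication saving comes from; a smaller k lowers the cost per round but raises ω, and ω is the quantity that enters the rate quoted in the abstract.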

Videos

CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression (SlidesLive)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data

Methods: Adam