vqSGD: Vector Quantized Stochastic Gradient Descent

Venkata Gandikota; Daniel Kane; Raj Kumar Maity; Arya Mazumdar

arXiv:1911.07971·cs.LG·December 29, 2020

TL;DR

This paper introduces vqSGD, a family of vector quantization schemes for distributed optimization that significantly reduce communication costs while maintaining convergence guarantees, leveraging information theory and error-correcting codes.

Contribution

The paper proposes novel vector quantization schemes for stochastic gradient descent that are near optimal, communication-efficient, and provide privacy guarantees.

Findings

01 Requires only $o(d)$ bits of communication per gradient estimate

02 Achieves an asymptotic reduction in communication cost

03 Provides convergence guarantees and privacy benefits

Abstract

In this work, we present a family of vector quantization schemes vqSGD (Vector-Quantized Stochastic Gradient Descent) that provide an asymptotic reduction in the communication cost with convergence guarantees in first-order distributed optimization. In the process we derive the following fundamental information theoretic fact: $\Theta\left(\frac{d}{R^{2}}\right)$ bits are necessary and sufficient to describe an unbiased estimator $\hat{g}(g)$ for any $g$ in the $d$-dimensional unit sphere, under the constraint that $\|\hat{g}(g)\|_{2} \leq R$ almost surely. In particular, we consider a randomized scheme based on the convex hull of a point set, that returns an unbiased estimator of a $d$-dimensional gradient vector with almost surely bounded norm. We provide multiple efficient instances of our scheme, that are near optimal, and require only $o(d)$ bits of communication at the expense of…
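
To make the convex-hull idea concrete, here is a minimal Python sketch of one possible instance. The point set is an assumption chosen for illustration (the truncated abstract does not name its instances): the $2d$ vertices $\pm\sqrt{d}\,e_i$ of a scaled cross-polytope. Any $g$ with $\|g\|_2 \leq 1$ has $\|g\|_1 \leq \sqrt{d}$, so it lies in the hull of these points. The function names are hypothetical and are not taken from the paper.

```python
import math
import random


def cross_polytope_probs(g):
    """Convex-combination weights of g over the 2d points +/- sqrt(d) * e_i.

    Index i in [0, d) is the point +sqrt(d) * e_i; index d + i is -sqrt(d) * e_i.
    Assumes ||g||_2 <= 1, which implies ||g||_1 <= sqrt(d).
    """
    d = len(g)
    scale = math.sqrt(d)
    l1 = sum(abs(x) for x in g)
    if l1 > scale * (1 + 1e-12):
        raise ValueError("g lies outside the convex hull of the point set")
    # Leftover probability mass is spread evenly over all points.
    # The points sum to zero, so this does not bias the estimate.
    slack = max(0.0, 1.0 - l1 / scale) / (2 * d)
    probs = [slack] * (2 * d)
    for i, x in enumerate(g):
        if x >= 0:
            probs[i] += x / scale
        else:
            probs[d + i] += -x / scale
    return probs


def quantize(g, rng=random):
    """Sample one point index; the index is all that needs to be sent."""
    probs = cross_polytope_probs(g)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def dequantize(index, d):
    """Rebuild the estimate (a scaled, signed basis vector) from the index."""
    est = [0.0] * d
    if index < d:
        est[index] = math.sqrt(d)
    else:
        est[index - d] = -math.sqrt(d)
    return est


def expected_estimate(g):
    """Exact expectation of the estimator, used to check unbiasedness."""
    d = len(g)
    probs = cross_polytope_probs(g)
    scale = math.sqrt(d)
    return [scale * (probs[i] - probs[d + i]) for i in range(d)]
```

In this sketch a worker would send only the sampled index, which takes $\lceil \log_2(2d) \rceil$ bits, and the receiver would call `dequantize` to recover an unbiased estimate whose norm is exactly $\sqrt{d}$. It illustrates the trade-off stated in the abstract between the number of bits and the norm bound $R$ of the estimator; it is not a reproduction of the paper's specific constructions.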
