DEED: A General Quantization Scheme for Communication Efficiency in Bits
Tian Ye, Peijun Xiao, Ruoyu Sun

TL;DR
This paper introduces DEED, a versatile quantization scheme for distributed optimization that reduces communication costs across various settings without sacrificing convergence rates.
Contribution
The paper proposes DEED, a general quantization method applicable to multiple distributed optimization scenarios, achieving lower communication complexity while maintaining convergence.
Findings
DEED achieves $\tilde{O}(\sqrt{\kappa} \log 1/\epsilon)$ bits in large-memory settings.
DEED combined with SGD requires $\tilde{O}(\kappa \log 1/\epsilon)$ bits in small-memory settings.
In federated learning, DEED reduces total bits compared to standard federated averaging.
Abstract
In distributed optimization, a popular technique to reduce communication is quantization. In this paper, we provide a general analysis framework for inexact gradient descent that is applicable to quantization schemes. We also propose a quantization scheme Double Encoding and Error Diminishing (DEED). DEED can achieve small communication complexity in three settings: frequent-communication large-memory, frequent-communication small-memory, and infrequent-communication (e.g. federated learning). More specifically, in the frequent-communication large-memory setting, DEED can be easily combined with Nesterov's method, so that the total number of bits required is $\tilde{O}(\sqrt{\kappa} \log 1/\epsilon)$, where $\tilde{O}$ hides numerical constants and logarithmic factors. In the frequent-communication small-memory setting, DEED combined with SGD only requires $\tilde{O}( \kappa \log…
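To make the idea of inexact gradient descent under quantization concrete, here is a minimal Python sketch. It is not DEED's encoding, which this summary does not spell out. It assumes a plain uniform grid quantizer whose range is rescaled to the current gradient each round, so the quantization error shrinks as the gradient shrinks. The names `quantize` and `quantized_gd`, the quadratic test problem, and the 32-bit cost charged for sending the scale are all illustrative assumptions.

```python
import numpy as np

def quantize(v, radius, bits):
    """Round each coordinate of v onto a uniform grid over [-radius, radius].

    The grid has 2**bits points, so every coordinate costs `bits` bits.
    """
    if radius == 0.0:
        return np.zeros_like(v)
    levels = 2 ** bits - 1                 # number of grid intervals
    step = 2.0 * radius / levels
    idx = np.round((np.clip(v, -radius, radius) + radius) / step)
    return idx * step - radius

def quantized_gd(A, b, bits=6, iters=500):
    """Inexact gradient descent on f(x) = 0.5 x^T A x - b^T x.

    Returns the final iterate and the total number of bits sent, counting
    `bits` per coordinate plus an assumed 32-bit scale per round.
    """
    L = np.linalg.eigvalsh(A)[-1]          # smoothness constant (largest eigenvalue)
    x = np.zeros_like(b, dtype=float)
    total_bits = 0
    for _ in range(iters):
        g = A @ x - b                      # exact gradient on the worker
        radius = float(np.max(np.abs(g)))  # grid range tracks the gradient size
        q = quantize(g, radius, bits)      # what the server actually receives
        x = x - q / L                      # step with the inexact gradient
        total_bits += bits * g.size + 32
    return x, total_bits

# Hypothetical test problem: diagonal quadratic with condition number 10.
A = np.diag(np.linspace(1.0, 10.0, 10))
b = np.ones(10)
x, total_bits = quantized_gd(A, b)
x_star = np.linalg.solve(A, b)
print(np.linalg.norm(x - x_star), total_bits)
```

Because the grid range follows the gradient's magnitude, the per-round error stays proportional to the gradient norm, which is the kind of inexactness a linear-convergence analysis can absorb. A fixed range would instead leave an error floor that the iterates cannot get below.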
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data
Methods: Stochastic Gradient Descent
