Compressing gradients by exploiting temporal correlation in momentum-SGD
Tharindu B. Adikari, Stark C. Draper

TL;DR
This paper introduces compression methods that leverage the temporal correlation in momentum-SGD updates to significantly reduce communication in decentralized optimization, with proven convergence guarantees under expected error bounds.
Contribution
It proposes compression techniques that exploit temporal correlation in momentum-SGD updates, for systems both with and without error-feedback, and provides a convergence analysis of SGD with compression under expected error bounds.
Findings
Significant reduction in communication rate with negligible computational overhead.
Effective compression methods for systems with and without error-feedback.
Convergence guarantees for SGD with compression under expected error bounds.
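For readers unfamiliar with error-feedback, the following is a minimal sketch of the generic idea: whatever the compressor drops in one round is stored locally and added back before compressing in the next round. The top-k sparsifier, the function names, and the toy vectors are all assumptions chosen for illustration; this is not claimed to be the scheme used in the paper.

```python
def top_k(vec, k):
    """Keep the k largest-magnitude entries of vec and zero the rest."""
    keep = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    out = [0.0] * len(vec)
    for i in keep:
        out[i] = vec[i]
    return out


def compress_with_error_feedback(updates, k):
    """Compress a sequence of update vectors, carrying the dropped part forward."""
    memory = [0.0] * len(updates[0])  # compression error accumulated so far
    sent = []
    for u in updates:
        # Add back what earlier rounds failed to transmit.
        corrected = [ui + mi for ui, mi in zip(u, memory)]
        message = top_k(corrected, k)
        # Whatever was not transmitted this round becomes the new memory.
        memory = [ci - si for ci, si in zip(corrected, message)]
        sent.append(message)
    return sent, memory


# Hypothetical toy updates, three coordinates, one entry transmitted per round.
updates = [[1.0, 0.25, -0.5], [0.5, 0.25, -0.5], [0.25, 0.25, -0.5]]
sent, memory = compress_with_error_feedback(updates, k=1)
```

In this sketch, the transmitted messages plus the final memory always add up to the sum of the original updates, so nothing is permanently discarded, only delayed.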
Abstract
An increasing bottleneck in decentralized optimization is communication. Bigger models and growing datasets mean that decentralization of computation is important and that the amount of information exchanged is quickly growing. While compression techniques have been introduced to cope with the latter, none has considered leveraging the temporal correlations that exist in consecutive vector updates. An important example is distributed momentum-SGD where temporal correlation is enhanced by the low-pass-filtering effect of applying momentum. In this paper we design and analyze compression methods that exploit temporal correlation in systems both with and without error-feedback. Experiments with the ImageNet dataset demonstrate that our proposed methods offer significant reduction in the rate of communication at only a negligible increase in computation complexity. We further analyze the…
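The low-pass-filtering effect mentioned in the abstract can be illustrated with a small simulation. The sketch below assumes i.i.d. Gaussian gradients (a deliberately pessimistic toy model with no correlation between steps) and the momentum recursion v_t = beta * v_{t-1} + g_t; the dimension, step count, and beta = 0.9 are hypothetical choices, not values from the paper. It compares the average cosine similarity of consecutive raw gradients with that of consecutive momentum vectors.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_consecutive_cosine(vectors):
    """Average cosine similarity between each vector and its predecessor."""
    sims = [cosine(vectors[t - 1], vectors[t]) for t in range(1, len(vectors))]
    return sum(sims) / len(sims)


def simulate(dim=200, steps=300, beta=0.9, burn_in=50, seed=0):
    rng = random.Random(seed)
    v = [0.0] * dim
    grads, momenta = [], []
    for t in range(steps):
        # Toy assumption: gradients are independent noise at every step.
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Momentum acts as a first-order low-pass filter on the gradients.
        v = [beta * vi + gi for vi, gi in zip(v, g)]
        if t >= burn_in:  # skip the initial transient of the filter
            grads.append(g)
            momenta.append(v)
    return mean_consecutive_cosine(grads), mean_consecutive_cosine(momenta)


raw_sim, momentum_sim = simulate()
print(f"raw gradients:    {raw_sim:.3f}")
print(f"momentum vectors: {momentum_sim:.3f}")
```

Under these assumptions the raw gradients are essentially uncorrelated from one step to the next, while the momentum vectors are strongly correlated. That correlation is the redundancy a compressor could exploit, for example by encoding only the part of the new vector that cannot be predicted from the previous one.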
Taxonomy
Methods: Stochastic Gradient Descent
