Sparse Binary Compression: Towards Distributed Deep Learning with minimal Communication
Felix Sattler, Simon Wiedemann, Klaus-Robert Müller, Wojciech Samek

TL;DR
This paper introduces Sparse Binary Compression (SBC), a novel method that drastically reduces communication costs in distributed deep learning by combining gradient sparsification, binarization, and optimal encoding, enabling efficient training with minimal bandwidth.
Contribution
SBC is a compression framework for distributed training that combines the existing techniques of communication delay and gradient sparsification with a novel binarization method and optimal weight update encoding, with adaptable sparsity levels.
Findings
Reduces communication by over four orders of magnitude.
Enables training ResNet50 on ImageNet with 3531 times fewer bits.
Maintains comparable convergence speed despite high compression.
Abstract
Currently, progressively larger deep neural networks are trained on ever-growing data corpora. As this trend is only going to increase in the future, distributed training schemes are becoming increasingly relevant. A major issue in distributed training is the limited communication bandwidth between contributing nodes or prohibitive communication cost in general. These challenges become even more pressing as the number of computation nodes increases. To counteract this development, we propose sparse binary compression (SBC), a compression framework that allows for a drastic reduction of communication cost for distributed training. SBC combines existing techniques of communication delay and gradient sparsification with a novel binarization method and optimal weight update encoding to push compression gains to new limits. By doing so, our method also allows us to smoothly trade-off…
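To make the idea of combining sparsification with binarization more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation. The function names `sparse_binarize` and `decode`, the parameter `p`, and the rule of keeping only the stronger of the two sign groups are all hypothetical choices made for this example. Communication delay and the encoding of the surviving index positions are left out.

```python
def sparse_binarize(update, p=0.01):
    """Illustrative sparsify-then-binarize step (an assumption, not the paper's code).

    Keeps up to k of the largest positive and k of the most negative entries,
    then retains only the group with the larger mean magnitude and replaces
    every surviving entry with that group's mean.
    Returns (sorted indices of surviving entries, shared value).
    """
    k = max(1, int(len(update) * p))  # entries kept per sign (hypothetical rule)
    order = sorted(range(len(update)), key=lambda i: update[i])
    neg = [i for i in order[:k] if update[i] < 0]   # most negative entries
    pos = [i for i in order[-k:] if update[i] > 0]  # largest positive entries
    mu_pos = sum(update[i] for i in pos) / len(pos) if pos else 0.0
    mu_neg = sum(update[i] for i in neg) / len(neg) if neg else 0.0
    # Keep the group whose mean has the larger magnitude; drop the other one.
    if mu_pos >= -mu_neg:
        return sorted(pos), mu_pos
    return sorted(neg), mu_neg


def decode(indices, value, size):
    """Rebuild the dense update: the shared value at kept positions, zero elsewhere."""
    dense = [0.0] * size
    for i in indices:
        dense[i] = value
    return dense


if __name__ == "__main__":
    update = [0.5, -0.1, 0.02, -0.9, 0.7, 0.0, -0.03, 0.1]
    indices, value = sparse_binarize(update, p=0.25)
    print(indices, value)
    print(decode(indices, value, len(update)))
```

In this sketch a node would only need to transmit the kept positions and a single floating-point value per update, rather than one value per parameter, which is where the saving in bits would come from.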
Taxonomy
Methods: Gradient Sparsification
