Quantize Once, Train Fast: Allreduce-Compatible Compression with Provable Guarantees
Jihao Xin, Marco Canini, Peter Richtárik, Samuel Horváth

TL;DR
This paper presents Global-QSGD, a novel gradient quantization method compatible with Allreduce, offering theoretical guarantees and practical acceleration for distributed deep learning.
Contribution
We introduce Global-QSGD, an Allreduce-compatible quantization scheme with provable convergence guarantees, addressing limitations of prior heuristic methods.
Findings
Accelerates distributed training by up to 3.51× over baseline methods
Provides rigorous theoretical analysis extending unbiased compressor frameworks
Demonstrates effectiveness across various hardware configurations
Abstract
Distributed training enables large-scale deep learning, but suffers from high communication overhead, especially as models and datasets grow. Gradient compression, particularly quantization, is a promising approach to mitigate this bottleneck. However, existing quantization schemes are often incompatible with Allreduce, the dominant communication primitive in distributed deep learning, and many prior solutions rely on heuristics without theoretical guarantees. We introduce Global-QSGD, an Allreduce-compatible gradient quantization method that leverages global norm scaling to reduce communication overhead while preserving accuracy. Global-QSGD is backed by rigorous theoretical analysis, extending standard unbiased compressor frameworks to establish formal convergence guarantees. Additionally, we develop a performance model to evaluate its impact across different hardware configurations.…
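To make the Allreduce-compatibility idea concrete, here is a minimal single-process sketch in NumPy. It is not the authors' implementation. It assumes that "global norm scaling" can be illustrated by letting every worker quantize against one shared scale (here, the largest per-worker L2 norm), so that the integer codes from different workers can be summed directly, as an Allreduce-sum would do, and dequantized once afterwards. The function names, the `levels` parameter, and the choice of scale are all hypothetical.

```python
import numpy as np


def stochastic_round(x, rng):
    """Round each entry up or down at random so that E[result] == x."""
    floor = np.floor(x)
    # Round up with probability equal to the fractional part.
    return floor + (rng.random(x.shape) < (x - floor))


def global_norm_quantize_allreduce(worker_grads, levels=127, seed=0):
    """Simulate averaging gradients with a shared-scale quantizer.

    Assumption (not taken from the paper): the shared scale is the
    maximum per-worker L2 norm, agreed on before quantization.
    """
    rng = np.random.default_rng(seed)

    # Step 1: agree on one scale (stands in for a tiny Allreduce-max).
    scale = max(np.linalg.norm(g) for g in worker_grads)
    if scale == 0.0:
        return np.zeros_like(worker_grads[0], dtype=np.float64)

    # Step 2: every worker maps its gradient to integers in
    # [-levels, levels] using the SAME scale.
    codes = [
        stochastic_round(g / scale * levels, rng).astype(np.int32)
        for g in worker_grads
    ]

    # Step 3: integer codes add up meaningfully because the scale is
    # shared (stands in for an Allreduce-sum on the compressed data).
    total = np.sum(codes, axis=0)

    # Step 4: dequantize once and average.
    return total * (scale / levels) / len(worker_grads)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    grads = [rng.standard_normal(8) for _ in range(4)]
    print("exact :", np.mean(grads, axis=0))
    print("approx:", global_norm_quantize_allreduce(grads))
```

With per-worker scales, the integer codes would live on different grids and could not be summed without decompressing first. Sharing one scale is what lets the reduction operate on compressed values in this sketch.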
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques
