QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices
Juntao Zhao, Borui Wan, Yanghua Peng, Haibin Lin, Yibo Zhu, Chuan Wu

TL;DR
QSync enables efficient synchronous data-parallel training across a mix of training and inference GPUs by quantizing only the operators it needs to, limiting accuracy loss while keeping the efficiency gains of quantization.
Contribution
QSync targets hybrid device training by combining quantized operators with a predictor and an allocator, selecting operator precision per device to balance accuracy and training efficiency.
Findings
The predictor simulates mixed-precision training with less than 5% error.
QSync improves model accuracy by 0.27-1.03% over uniform-precision training.
QSync supports various GPU architectures with minimal accuracy degradation.
Abstract
A number of production deep learning clusters have explored using inference hardware for DNN training during off-peak serving hours, when many inference GPUs sit idle. Conducting DNN training with a combination of heterogeneous training and inference GPUs, known as hybrid device training, presents considerable challenges due to disparities in compute capability and significant differences in memory capacity. We propose QSync, a training system that enables efficient synchronous data-parallel DNN training over hybrid devices by strategically exploiting quantized operators. According to each device's available resource capacity, QSync selects a quantization-minimized setting for operators in the distributed DNN training graph, minimizing model accuracy degradation while keeping the training efficiency brought by quantization. We carefully design a predictor with a bi-directional…
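The idea of choosing a quantization-minimized setting under a device's resource limit can be illustrated with a toy example. The sketch below is not QSync's predictor or allocator. It is a simple greedy heuristic, assuming each operator has a hypothetical element count and a hypothetical sensitivity score, and that memory scales with bytes per element. The names `Op`, `memory`, and `choose_precisions` are invented for illustration.

```python
from dataclasses import dataclass

# Bytes per element for each precision (a simplifying assumption for this sketch).
BYTES = {"fp32": 4, "fp16": 2, "int8": 1}
ORDER = ["fp32", "fp16", "int8"]  # highest to lowest precision


@dataclass
class Op:
    name: str
    num_elements: int   # hypothetical size of the operator's tensors
    sensitivity: float  # hypothetical accuracy cost of lowering this operator


def memory(ops, plan):
    """Total bytes used by all operators under the given precision plan."""
    return sum(op.num_elements * BYTES[plan[op.name]] for op in ops)


def choose_precisions(ops, budget_bytes):
    """Greedy heuristic: lower the least sensitive operators first,
    and stop as soon as the plan fits the memory budget."""
    plan = {op.name: "fp32" for op in ops}
    by_sensitivity = sorted(ops, key=lambda op: op.sensitivity)
    for level in ORDER[1:]:
        for op in by_sensitivity:
            if memory(ops, plan) <= budget_bytes:
                return plan
            plan[op.name] = level
    if memory(ops, plan) > budget_bytes:
        raise ValueError("budget cannot be met even at the lowest precision")
    return plan


ops = [Op("conv1", 1000, 0.9), Op("fc", 4000, 0.1), Op("conv2", 2000, 0.5)]
print(choose_precisions(ops, budget_bytes=20000))
# Only "fc" (the least sensitive operator) is lowered to fp16; the rest stay fp32.
```

With a larger budget every operator stays at full precision, which is the sense in which the plan is quantization-minimized in this toy setting. A real system would also have to account for compute throughput and synchronization across devices, which this sketch ignores.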
Taxonomy
Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Quantum-Dot Cellular Automata
