SigmaQuant: Hardware-Aware Heterogeneous Quantization Method for Edge DNN Inference
Qunyou Liu, Pengbo Yu, Marina Zapater, David Atienza

TL;DR
SigmaQuant is an adaptive, hardware-aware heterogeneous quantization framework that optimizes DNN inference on edge devices by balancing accuracy and resource constraints without extensive search.
Contribution
The paper introduces SigmaQuant, an adaptive layer-wise quantization method that adjusts to hardware conditions and improves on uniform and existing heterogeneous approaches.
Findings
Achieves better accuracy-resource trade-offs on edge devices.
Reduces the need for exhaustive search in quantization design.
Adapts to diverse hardware constraints effectively.
Abstract
Deep neural networks (DNNs) are essential for performing advanced tasks on edge or mobile devices, yet their deployment is often hindered by severe resource constraints, including limited memory, energy, and computational power. While uniform quantization provides a straightforward approach to compress models and reduce hardware requirements, it fails to fully leverage the varying robustness across layers and often leads to accuracy degradation or suboptimal resource usage, particularly at low bitwidths. In contrast, heterogeneous quantization, which allocates different bitwidths to individual layers, can mitigate these drawbacks. Nonetheless, current heterogeneous quantization methods either require a huge brute-force design-space search or lack the adaptability to meet different hardware conditions, such as memory size, energy budget, and latency requirements. Filling these gaps, this work…
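To make the uniform versus heterogeneous distinction concrete, the sketch below quantizes a toy set of layers with per-layer bitwidths and compares the resulting weight storage. It is only an illustration under stated assumptions: the layer names, sizes, weight distributions, and hand-picked bitwidths are hypothetical, the quantizer is a generic symmetric uniform scheme, and nothing here reproduces SigmaQuant's actual allocation procedure.

```python
import random


def quantize(weights, bits):
    """Symmetric uniform quantization to `bits` bits (bits >= 2), returned dequantized."""
    qmax = 2 ** (bits - 1) - 1                      # largest signed integer level
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # step size between levels
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def mse(a, b):
    """Mean squared error between two equally long lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def model_bits(layers, bitwidths):
    """Total weight storage in bits for a per-layer bitwidth assignment."""
    return sum(len(layers[name]) * bitwidths[name] for name in layers)


# Hypothetical toy "model": layer name -> flat list of weights.
random.seed(0)
layers = {
    "conv1": [random.gauss(0, 1.0) for _ in range(200)],
    "conv2": [random.gauss(0, 0.1) for _ in range(2000)],
    "fc": [random.gauss(0, 0.05) for _ in range(4000)],
}

# Uniform: every layer gets the same bitwidth.
uniform = {name: 4 for name in layers}
# Heterogeneous: bitwidths chosen by hand for illustration only.
hetero = {"conv1": 8, "conv2": 4, "fc": 3}

for label, plan in (("uniform", uniform), ("heterogeneous", hetero)):
    errors = {n: mse(layers[n], quantize(layers[n], plan[n])) for n in layers}
    print(label, model_bits(layers, plan), "bits", errors)
```

With these assumed sizes, the uniform 4-bit plan stores 24,800 bits of weights and the hand-picked heterogeneous plan stores 21,600 bits, while giving the small first layer more precision. Choosing such a plan automatically under memory, energy, and latency limits is the problem the abstract describes.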
Taxonomy
Topics: Advanced Neural Network Applications · Big Data and Digital Economy · Advanced Data Compression Techniques
