DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic
Hazem Hesham Yousef Shalby, Fabrizio Pittorino, Francesca Palermo, Diana Trojaniello, Manuel Roveri

TL;DR
DQT is an integer-only quantization framework for dynamic, instance-based mixed-precision inference. It switches precision without the costly dequantize-requantize cycle, which the authors report improves both accuracy and computational efficiency.
Contribution
The paper proposes a nested integer representation and custom integer-only arithmetic that enable on-the-fly bit-width switching, removing the dequantization bottleneck in dynamic quantization.
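To make the idea of nesting concrete, here is a minimal Python sketch. It assumes the simplest possible embedding: the lower-precision code is just the high-order bits of the signed higher-precision code, so an arithmetic right shift recovers it. The function name `to_lower_precision` and the sample values are hypothetical, and the paper's actual representation and rounding may differ.

```python
def to_lower_precision(q: int, from_bits: int, to_bits: int) -> int:
    """Reduce a signed integer code from `from_bits` to `to_bits`.

    Assumption for illustration: the low-precision code is the top
    `to_bits` bits of the high-precision code, so an arithmetic right
    shift (floor division by a power of two) is all that is needed.
    """
    if not 0 < to_bits <= from_bits:
        raise ValueError("to_bits must be in (0, from_bits]")
    lo = -(1 << (from_bits - 1))
    hi = (1 << (from_bits - 1)) - 1
    if not lo <= q <= hi:
        raise ValueError(f"{q} does not fit in {from_bits} signed bits")
    return q >> (from_bits - to_bits)


# Hypothetical 8-bit codes covering the signed range.
weights_8bit = [-128, -37, -1, 0, 5, 90, 127]

weights_4bit = [to_lower_precision(q, 8, 4) for q in weights_8bit]
weights_2bit = [to_lower_precision(q, 8, 2) for q in weights_8bit]

# Nesting: going 8 -> 4 -> 2 bits gives the same codes as going 8 -> 2 directly.
via_4bit = [to_lower_precision(q, 4, 2) for q in weights_4bit]

print(weights_4bit)  # [-8, -3, -1, 0, 0, 5, 7]
print(weights_2bit)  # [-2, -1, -1, 0, 0, 1, 1]
print(via_4bit == weights_2bit)  # True
```

Under this assumption a single stored 8-bit tensor can serve every lower bit-width, and no floating-point value is ever materialized during the switch.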
Findings
Achieves state-of-the-art accuracy on ResNet models.
Reduces transition cost to simple bit-shift operations.
Outperforms static and previous dynamic methods in efficiency.
Abstract
The deployment of deep neural networks on resource-constrained devices relies on quantization. While static, uniform quantization applies a fixed bit-width to all inputs, it fails to adapt to their varying complexity. Dynamic, instance-based mixed-precision quantization promises a superior accuracy-efficiency trade-off by allocating higher precision only when needed. However, a critical bottleneck remains: existing methods require a costly dequantize-to-float and requantize-to-integer cycle to change precision, breaking the integer-only hardware paradigm and compromising performance gains. This paper introduces Dynamic Quantization Training (DQT), a novel framework that removes this bottleneck. At the core of DQT is a nested integer representation where lower-precision values are bit-wise embedded within higher-precision ones. This design, coupled with custom integer-only arithmetic,…
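The bottleneck described in the abstract can be illustrated by comparing the two ways of changing precision. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses symmetric uniform quantization with a hypothetical scale `scale8`, assumes the 4-bit step is exactly 16 times the 8-bit step, and uses made-up function names.

```python
def requantize_via_float(q8: int, scale8: float) -> int:
    """Conventional cycle: integer -> float -> integer."""
    x = q8 * scale8               # dequantize the 8-bit code to a real value
    scale4 = scale8 * 16          # assumed: 4-bit step is 2**(8-4) times coarser
    q4 = round(x / scale4)        # requantize with round-to-nearest
    return max(-8, min(7, q4))    # clamp to the signed 4-bit range


def requantize_via_shift(q8: int) -> int:
    """Integer-only alternative: drop the four low-order bits."""
    return q8 >> 4


scale8 = 0.05  # hypothetical 8-bit step size

# The shift floors while the float path rounds to nearest, so the two
# codes can differ, but never by more than one 4-bit step.
max_gap = max(
    abs(requantize_via_float(q, scale8) - requantize_via_shift(q))
    for q in range(-128, 128)
)
print(max_gap <= 1)  # True
```

The float path needs a multiply, a divide, a rounding step, and a clamp per value, while the shift path is a single integer instruction. The price in this naive version is a different rounding behavior (floor instead of round-to-nearest), which a training procedure would have to account for.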
