DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic

Hazem Hesham Yousef Shalby; Fabrizio Pittorino; Francesca Palermo; Diana Trojaniello; Manuel Roveri

arXiv:2508.09176 · cs.LG · March 24, 2026

TL;DR

DQT is an integer-only quantization framework for dynamic, instance-based mixed-precision deployment of neural networks. It changes bit-width without the costly dequantize-to-float and requantize-to-integer cycle, which the authors report improves both accuracy and computational efficiency.

Contribution

The paper proposes a nested integer representation, together with custom integer-only arithmetic, that allows bit-width to be switched on the fly and removes the dequantization bottleneck in dynamic quantization.
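As a rough illustration of what "nested" can mean here, the sketch below treats a lower-precision code as the high-order bits of a higher-precision one, so narrowing is a single right shift. This is not the authors' implementation: the function names, the unsigned 8-bit example value, the power-of-two scale, and the use of truncation (rather than some other rounding rule) are all assumptions made for the example.

```python
import math


def narrow(q: int, from_bits: int, to_bits: int) -> int:
    """Switch an integer code to a lower bit-width by dropping low-order bits.

    Hypothetical helper: assumes the low-precision code is stored as the
    top `to_bits` bits of the high-precision code.
    """
    if not 0 < to_bits <= from_bits:
        raise ValueError("to_bits must be in 1..from_bits")
    return q >> (from_bits - to_bits)


def narrow_via_float(q: int, from_bits: int, to_bits: int, scale: float) -> int:
    """Reference path: dequantize to float, then requantize on the coarser grid.

    Assumes the coarser grid's step is the fine step times 2**(bits dropped)
    and that requantization truncates toward negative infinity.
    """
    x = q * scale                                   # dequantize
    coarse_scale = scale * 2 ** (from_bits - to_bits)
    return math.floor(x / coarse_scale)             # requantize


q8 = 0b10110110          # hypothetical unsigned 8-bit code
q4 = narrow(q8, 8, 4)    # top 4 bits: 0b1011
q2 = narrow(q8, 8, 2)    # top 2 bits: 0b10
```

Under these assumptions the shift gives the same code as the float round trip, and narrowing in two steps (8 to 4, then 4 to 2) agrees with narrowing in one step, which is the property that makes the representation nested.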

Findings

01. Achieves state-of-the-art accuracy on ResNet models.

02. Reduces the cost of a precision transition to simple bit-shift operations.

03. Outperforms static and previous dynamic methods in efficiency.

Abstract

The deployment of deep neural networks on resource-constrained devices relies on quantization. While static, uniform quantization applies a fixed bit-width to all inputs, it fails to adapt to their varying complexity. Dynamic, instance-based mixed-precision quantization promises a superior accuracy-efficiency trade-off by allocating higher precision only when needed. However, a critical bottleneck remains: existing methods require a costly dequantize-to-float and requantize-to-integer cycle to change precision, breaking the integer-only hardware paradigm and compromising performance gains. This paper introduces Dynamic Quantization Training (DQT), a novel framework that removes this bottleneck. At the core of DQT is a nested integer representation where lower-precision values are bit-wise embedded within higher-precision ones. This design, coupled with custom integer-only arithmetic,…
