Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution
Zhang Zhaoyang, Shao Wenqi, Gu Jinwei, Wang Xiaogang, Luo Ping

TL;DR
This paper introduces Differentiable Dynamic Quantization (DDQ), a fully learnable method for quantizing neural networks that adapts precision and resolution per layer, outperforming prior methods especially on lightweight models like MobileNets.
Contribution
The paper proposes DDQ, a novel differentiable approach to learn quantization hyperparameters, enabling efficient and hardware-friendly low-precision neural network deployment.
Findings
DDQ achieves lossless 4-bit quantization for MobileNetV2 on ImageNet.
DDQ outperforms prior quantization methods on various networks and benchmarks.
DDQ is capable of quantizing challenging lightweight architectures effectively.
Abstract
Model quantization is challenging due to many tedious hyper-parameters such as precision (bitwidth), dynamic range (minimum and maximum discrete values) and stepsize (interval between discrete values). Unlike prior art that carefully tunes these values, we present a fully differentiable approach to learn all of them, named Differentiable Dynamic Quantization (DDQ), which has several benefits. (1) DDQ is able to quantize challenging lightweight architectures like MobileNets, where different layers prefer different quantization parameters. (2) DDQ is hardware-friendly and can be easily implemented using low-precision matrix-vector multiplication, making it deployable on many hardware platforms such as ARM. (3) Extensive experiments show that DDQ outperforms prior art on many networks and benchmarks, especially when models are already efficient and compact. For example, DDQ is the first approach that…
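To make the three hyper-parameters named in the abstract concrete, here is a minimal sketch of a plain uniform quantizer in which precision, dynamic range and stepsize are all set by hand. It is not the DDQ method: DDQ learns these values, whereas this illustration fixes them. The function name `uniform_quantize` and its arguments are hypothetical and do not come from the paper.

```python
def uniform_quantize(x, bits, x_min, x_max):
    """Snap x to the nearest of 2**bits evenly spaced levels in [x_min, x_max].

    Illustrative assumption: a fixed, hand-chosen uniform grid.
    Returns the quantized value together with the stepsize.
    """
    if bits < 1 or x_max <= x_min:
        raise ValueError("need bits >= 1 and x_max > x_min")

    levels = 2 ** bits                      # precision: number of discrete values
    step = (x_max - x_min) / (levels - 1)   # stepsize: gap between adjacent levels

    clipped = min(max(x, x_min), x_max)     # dynamic range: saturate out-of-range inputs
    index = round((clipped - x_min) / step) # index of the nearest level
    return x_min + index * step, step


if __name__ == "__main__":
    # 2-bit grid on [-1, 1]: four levels spaced 2/3 apart.
    for value in (-3.0, -0.2, 0.3, 5.0):
        q, step = uniform_quantize(value, bits=2, x_min=-1.0, x_max=1.0)
        print(f"{value:+.2f} -> {q:+.4f} (step {step:.4f})")
```

Changing any one of `bits`, `x_min` or `x_max` changes the grid, which is why the abstract describes tuning them per layer as tedious.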
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Brain Tumor Detection and Classification
Methods: Batch Normalization · Convolution · Depthwise Convolution · Average Pooling · 1x1 Convolution · Pointwise Convolution · Depthwise Separable Convolution · Inverted Residual Block
