LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks
Dongqing Zhang, Jiaolong Yang, Dongqiangzi Ye, Gang Hua

TL;DR
LQ-Nets introduces a joint training approach for learned quantizers in deep neural networks, significantly improving accuracy of quantized models across various architectures and datasets, while maintaining compatibility with bit-operations.
Contribution
The paper presents a method for jointly training a quantized DNN together with its quantizers. The learned quantizers outperform fixed, handcrafted schemes and apply to both weights and activations at arbitrary bit precision.
Findings
Consistently outperforms previous quantization methods in accuracy.
Effective across multiple network architectures and datasets.
Quantizers are easy to train and compatible with bit-operations.
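The bit-operation compatibility mentioned in the findings can be illustrated with a small sketch. It assumes, purely for illustration, that each quantized value is a weighted sum of ±1 bits with one scale per bit position; the function names (`pack_bits`, `binary_dot`, `multibit_dot`, `decode`) and the example scales are hypothetical and are not taken from the paper's code. Under that assumption, an inner product between two quantized vectors reduces to XOR and popcount operations on packed bit planes.

```python
def pack_bits(signs):
    """Pack a list of +1/-1 values into an int; bit k is set when signs[k] == +1."""
    word = 0
    for k, s in enumerate(signs):
        if s == 1:
            word |= 1 << k
    return word


def binary_dot(word_a, word_b, n):
    """Inner product of two length-n +1/-1 vectors, computed from packed bits.

    Positions where the bits differ contribute -1, the rest contribute +1,
    so the result is n - 2 * popcount(a XOR b).
    """
    mismatches = bin(word_a ^ word_b).count("1")
    return n - 2 * mismatches


def multibit_dot(planes_a, scales_a, planes_b, scales_b, n):
    """Inner product of two multi-bit vectors given as scaled bit planes."""
    total = 0.0
    for sa, pa in zip(scales_a, planes_a):
        for sb, pb in zip(scales_b, planes_b):
            total += sa * sb * binary_dot(pa, pb, n)
    return total


def decode(sign_planes, scales):
    """Reconstruct real values: element k is sum_i scales[i] * sign_planes[i][k]."""
    n = len(sign_planes[0])
    return [sum(s * plane[k] for s, plane in zip(scales, sign_planes))
            for k in range(n)]


# Hypothetical 2-bit example with 4 elements per vector.
a_signs = [[1, -1, 1, 1], [-1, -1, 1, -1]]
b_signs = [[1, 1, -1, 1], [1, -1, -1, -1]]
a_scales = [1.0, 0.5]   # assumed per-bit scales
b_scales = [0.8, 0.3]   # assumed per-bit scales

fast = multibit_dot([pack_bits(p) for p in a_signs], a_scales,
                    [pack_bits(p) for p in b_signs], b_scales, n=4)
reference = sum(x * y for x, y in zip(decode(a_signs, a_scales),
                                      decode(b_signs, b_scales)))
```

Here `fast` and `reference` agree up to floating-point rounding. With K-bit operands on both sides the bit-wise path needs K × K binary inner products, which is why this kind of decomposition suits low bit widths.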
Abstract
Although weight and activation quantization is an effective approach for Deep Neural Network (DNN) compression and has considerable potential to increase inference speed by leveraging bit-operations, there is still a noticeable gap in prediction accuracy between the quantized model and the full-precision model. To address this gap, we propose to jointly train a quantized, bit-operation-compatible DNN and its associated quantizers, as opposed to using fixed, handcrafted quantization schemes such as uniform or logarithmic quantization. Our method for learning the quantizers applies to both network weights and activations with arbitrary-bit precision, and our quantizers are easy to train. Comprehensive experiments on the CIFAR-10 and ImageNet datasets show that our method works consistently well for various network structures such as AlexNet, VGG-Net, GoogLeNet, ResNet, and DenseNet,…
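To make the contrast between fixed and learned quantizers concrete, the following is a minimal sketch. It builds uniform and logarithmic level sets, then fits a small basis to sample data by alternating between assigning codes and updating the basis. The basis parameterization (levels formed as signed sums of a few scalars) and the simple coordinate-wise fit are assumptions made for illustration. They stand in for, and should not be read as, the paper's actual training procedure, and every name here is hypothetical.

```python
from itertools import product


def uniform_levels(num_bits, max_abs):
    """Fixed scheme: evenly spaced levels on [-max_abs, max_abs]."""
    count = 2 ** num_bits
    step = 2 * max_abs / (count - 1)
    return [-max_abs + i * step for i in range(count)]


def logarithmic_levels(num_bits, max_abs):
    """Fixed scheme: signed power-of-two levels, +/- max_abs * 2**-k."""
    per_sign = 2 ** (num_bits - 1)
    mags = [max_abs * 2.0 ** -k for k in range(per_sign)]
    return sorted([-m for m in mags] + mags)


def codes(num_bits):
    """All +1/-1 codes of the given length."""
    return list(product((-1, 1), repeat=num_bits))


def level(code, basis):
    """Level encoded by one code: sum_i code[i] * basis[i]."""
    return sum(s * v for s, v in zip(code, basis))


def basis_levels(basis):
    """Assumed learnable scheme: one level per code over the basis."""
    return sorted(level(c, basis) for c in codes(len(basis)))


def mse(data, levels):
    """Mean squared error when each value snaps to its nearest level."""
    return sum(min((x - q) ** 2 for q in levels) for x in data) / len(data)


def fit_basis(data, basis, steps=10):
    """Alternate nearest-code assignment with a per-coordinate least-squares update."""
    basis = list(basis)
    all_codes = codes(len(basis))
    for _ in range(steps):
        # Step 1: with the basis fixed, pick the nearest code for every value.
        assigned = [min(all_codes, key=lambda c: abs(level(c, basis) - x))
                    for x in data]
        # Step 2: with the codes fixed, minimise squared error one coordinate at a time.
        for i in range(len(basis)):
            acc = 0.0
            for x, c in zip(data, assigned):
                others = level(c, basis) - c[i] * basis[i]
                acc += c[i] * (x - others)
            basis[i] = acc / len(data)  # c[i] ** 2 == 1, so the denominator is N
    return basis


# Hypothetical sample: mostly small values with two large outliers.
data = [-2.0, -0.3, -0.1, 0.0, 0.1, 0.2, 0.4, 2.5]
start = [0.5, 1.0]                  # this basis reproduces a uniform 2-bit grid
fitted = fit_basis(data, start)
error_before = mse(data, basis_levels(start))
error_after = mse(data, basis_levels(fitted))
```

Neither step of the alternating fit can increase the squared error, so `error_after` is never larger than `error_before`. The starting basis `[0.5, 1.0]` yields the same levels as `uniform_levels(2, 1.5)`, which shows that a uniform grid is one special case of this parameterization.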
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Batch Normalization · Bottleneck Residual Block · Residual Connection · Convolution · Residual Block · Average Pooling · Local Response Normalization · Auxiliary Classifier · Inception Module
