LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks

Dongqing Zhang; Jiaolong Yang; Dongqiangzi Ye; Gang Hua

arXiv:1807.10029 · cs.CV · July 27, 2018 · 45 cites

TL;DR

LQ-Nets introduces a joint training approach for learned quantizers in deep neural networks, significantly improving accuracy of quantized models across various architectures and datasets, while maintaining compatibility with bit-operations.

Contribution

The paper presents a method for jointly training a quantized DNN and its quantizers. The method outperforms fixed quantization schemes and applies to arbitrary bit precision.

Findings

01  Consistently outperforms previous quantization methods in accuracy.

02  Effective across multiple network architectures and datasets.

03  Quantizers are easy to train and compatible with bit-operations.
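
To make the bit-operation compatibility in finding 03 concrete, here is a minimal sketch. It assumes each quantized value is a weighted sum of {-1, +1} bits, with one list of weights (called `basis` here) per tensor. That parameterization, the function names, and the sample numbers are illustrative assumptions and are not taken from this page. Under that assumption, a dot product of two quantized vectors reduces to XOR and popcount on packed bit masks, scaled by products of basis weights.

```python
from itertools import product


def encode(x, basis):
    """Return the {-1, +1} code whose level sum(v_i * b_i) is nearest to x."""
    return min(product((-1, 1), repeat=len(basis)),
               key=lambda b: abs(x - sum(v * s for v, s in zip(basis, b))))


def bit_planes(values, basis):
    """Pack one integer bit mask per basis element; bit n is set when
    the code of values[n] has +1 in that position."""
    planes = [0] * len(basis)
    for n, x in enumerate(values):
        for i, s in enumerate(encode(x, basis)):
            if s == 1:
                planes[i] |= 1 << n
    return planes


def binary_dot(p, q, n):
    """Dot product of two length-n {-1, +1} vectors stored as bit masks.
    Positions that differ contribute -1, matching positions +1."""
    return n - 2 * bin(p ^ q).count("1")


def quantized_dot(w, a, w_basis, a_basis):
    """Dot product of the quantized vectors using only XOR and popcount
    on bit planes, plus a few multiplications by basis weights."""
    n = len(w)
    w_planes = bit_planes(w, w_basis)
    a_planes = bit_planes(a, a_basis)
    return sum(vw * va * binary_dot(p, q, n)
               for vw, p in zip(w_basis, w_planes)
               for va, q in zip(a_basis, a_planes))


def reference_dot(w, a, w_basis, a_basis):
    """Same result computed the slow way: dequantize, then multiply."""
    def dequantize(x, basis):
        return sum(v * s for v, s in zip(basis, encode(x, basis)))
    return sum(dequantize(x, w_basis) * dequantize(y, a_basis)
               for x, y in zip(w, a))
```

The number of XOR and popcount passes is the product of the two bit widths (four passes for 2-bit weights and 2-bit activations), independent of the actual basis values, which is why non-uniform levels of this form would not cost more at inference than uniform ones.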

Abstract

Although weight and activation quantization is an effective approach for Deep Neural Network (DNN) compression and has considerable potential to increase inference speed by leveraging bit-operations, there is still a noticeable gap in prediction accuracy between the quantized model and the full-precision model. To address this gap, we propose to jointly train a quantized, bit-operation-compatible DNN and its associated quantizers, as opposed to using fixed, handcrafted quantization schemes such as uniform or logarithmic quantization. Our method for learning the quantizers applies to both network weights and activations with arbitrary-bit precision, and our quantizers are easy to train. Comprehensive experiments on the CIFAR-10 and ImageNet datasets show that our method works consistently well for various network structures such as AlexNet, VGG-Net, GoogLeNet, ResNet, and DenseNet,…
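
The contrast between a fixed uniform quantizer and a quantizer fitted to the data can be sketched as follows. The truncated abstract does not spell out how the quantizers are parameterized or trained, so this sketch makes its own assumptions: 2-bit codes in {-1, +1}, quantization levels formed as weighted sums of those bits, and a simple alternating fit (re-encode, then solve least squares for the weights). All function names and the sample data are hypothetical.

```python
from itertools import product


def levels(basis):
    """All (code, value) pairs with value = sum(v_i * b_i), b_i in {-1, +1}."""
    return [(b, sum(v * s for v, s in zip(basis, b)))
            for b in product((-1, 1), repeat=len(basis))]


def encode_all(xs, basis):
    """Assign each sample the code of its nearest quantization level."""
    table = levels(basis)
    return [min(table, key=lambda t: abs(x - t[1]))[0] for x in xs]


def mse(xs, basis):
    """Mean squared quantization error under nearest-level encoding."""
    codes = encode_all(xs, basis)
    return sum((x - sum(v * s for v, s in zip(basis, b))) ** 2
               for x, b in zip(xs, codes)) / len(xs)


def refit_basis_2bit(xs, codes):
    """Least-squares basis for fixed 2-bit codes via the normal equations.
    Since every bit squared is 1, B^T B = [[n, s12], [s12, n]]."""
    n = len(xs)
    s12 = sum(b[0] * b[1] for b in codes)
    r1 = sum(b[0] * x for b, x in zip(codes, xs))
    r2 = sum(b[1] * x for b, x in zip(codes, xs))
    det = n * n - s12 * s12
    if det == 0:          # degenerate codes: no unique solution
        return None
    return [(n * r1 - s12 * r2) / det, (n * r2 - s12 * r1) / det]


def fit_basis(xs, basis, steps=10):
    """Alternate between re-encoding the samples and refitting the basis.
    Neither step can increase the squared error."""
    for _ in range(steps):
        new_basis = refit_basis_2bit(xs, encode_all(xs, basis))
        if new_basis is None:
            break
        basis = new_basis
    return basis


# Illustrative samples clustered around zero (made-up data).
samples = [-2.1, -0.9, -0.6, -0.4, -0.2, -0.1, 0.0,
           0.1, 0.3, 0.5, 0.7, 1.0, 1.9]
uniform_basis = [1.0, 2.0]          # levels -3, -1, 1, 3: a uniform grid
fitted_basis = fit_basis(samples, uniform_basis)
```

With the basis `[1.0, 2.0]` the levels form a uniform grid, so uniform quantization appears here as one special case of the assumed parameterization; `mse(samples, fitted_basis)` can be compared against `mse(samples, uniform_basis)` to see the effect of fitting the levels to the data.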

Code & Models

Repositories

Microsoft/LQ-Nets (TensorFlow, official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Batch Normalization · Bottleneck Residual Block · Residual Connection · Convolution · Residual Block · Average Pooling · Local Response Normalization · Auxiliary Classifier · Inception Module