Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks

Tuan Hoang; Thanh-Toan Do; Tam V. Nguyen; Ngai-Man Cheung

arXiv:2012.13762·cs.CV·December 29, 2020

TL;DR

This paper introduces two techniques for training low bit-width deep neural networks: direct updating of quantized weights with learnable quantization levels, and channel-aware activation quantization. The authors report state-of-the-art accuracy with both.

Contribution

The paper presents a method for directly updating quantized weights with learnable quantization levels, together with a channel-aware approach to activation quantization, to improve the training of low bit-width networks.

Findings

01 Achieves state-of-the-art accuracy on CIFAR-100 and ImageNet.

02 Effective low bit-width training for AlexNet, ResNet, and MobileNetV2.

03 Outperforms existing quantization methods in accuracy.

Abstract

This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods obtain the quantized weights by performing quantization on the full-precision network weights. However, this approach results in a mismatch: gradient descent updates the full-precision weights, but it does not update the quantized weights. To address this issue, we propose a novel method that enables direct updating of quantized weights with learnable quantization levels to minimize the cost function using gradient descent. Second, to obtain low bit-width activations, existing works consider all channels equally. However, the activation quantizers could be biased toward a few channels with high variance. To address this issue, we propose a method to take into account the quantization…
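The two problems the abstract describes can be illustrated with a small NumPy sketch. This is not the authors' method: it only reproduces the issues in the conventional setup, under assumptions made for illustration. The fixed levels, the toy gradient, the learning rate, the nearest-level and uniform quantizers, and the two synthetic activation channels are all hypothetical.

```python
import numpy as np

def quantize_to_levels(w, levels):
    """Map each weight to its nearest quantization level."""
    idx = np.abs(w[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]

def uniform_quantize(x, x_max, bits):
    """Unsigned uniform quantizer on [0, x_max] with 2**bits levels."""
    n_steps = 2 ** bits - 1
    step = x_max / n_steps
    return np.clip(np.round(x / step), 0, n_steps) * step

# Part 1: weight mismatch in the conventional setup.
# Gradient descent moves the latent full-precision weights, but the
# quantized weights actually used by the network need not change.
levels = np.array([-1.0, 0.0, 1.0])    # hypothetical fixed levels
w_fp = np.array([0.10, 0.80, -0.60])   # latent full-precision weights
grad = np.array([1.0, -1.0, 1.0])      # hypothetical gradient
lr = 0.05                              # hypothetical learning rate

q_before = quantize_to_levels(w_fp, levels)
w_fp_after = w_fp - lr * grad          # update applied to latent weights only
q_after = quantize_to_levels(w_fp_after, levels)

print("full-precision weights changed:", not np.allclose(w_fp, w_fp_after))
print("quantized weights changed:", not np.array_equal(q_before, q_after))

# Part 2: one activation quantizer shared by all channels.
# The shared range is set by the high-variance channel, so the
# low-variance channel is left with very few distinct levels.
low_var = np.linspace(0.0, 0.1, 101)   # synthetic low-variance channel
high_var = np.linspace(0.0, 8.0, 101)  # synthetic high-variance channel
shared_max = max(low_var.max(), high_var.max())

q_low = uniform_quantize(low_var, shared_max, bits=2)
q_high = uniform_quantize(high_var, shared_max, bits=2)
n_low = len(np.unique(q_low))
n_high = len(np.unique(q_high))

print("levels used by low-variance channel:", n_low)
print("levels used by high-variance channel:", n_high)
```

In this toy setting the latent weights move after the update while the quantized weights stay at the same levels, and the shared 2-bit quantizer maps the whole low-variance channel to a single level while the high-variance channel uses all four.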

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI

Methods: Depthwise Convolution · Residual Connection · Batch Normalization · Bottleneck Residual Block · Pointwise Convolution · Depthwise Separable Convolution · Inverted Residual Block · Average Pooling · Global Average Pooling