Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks
Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung

TL;DR
This paper introduces two techniques for training low bit-width deep neural networks: direct quantization of weights with learnable levels, and channel-aware activation quantization. It reports state-of-the-art accuracy on CIFAR-100 and ImageNet.
Contribution
It presents a novel method for directly updating quantized weights with learnable levels and a channel-aware activation quantization approach, improving low-bit neural network training.
Findings
Achieves state-of-the-art accuracy on CIFAR-100 and ImageNet.
Effective low bit-width training for AlexNet, ResNet, MobileNetV2.
Outperforms existing quantization methods in accuracy.
Abstract
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods quantize the full-precision network weights. However, this approach results in a mismatch: gradient descent updates the full-precision weights, but it does not update the quantized weights. To address this issue, we propose a novel method that enables direct updating of quantized weights with learnable quantization levels to minimize the cost function using gradient descent. Second, to obtain low bit-width activations, existing works consider all channels equally. However, the activation quantizers could be biased toward a few channels with high variance. To address this issue, we propose a method to take into account the quantization…
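The mismatch described in the abstract can be illustrated with a toy example. The sketch below is not the paper's method: it shows only the conventional scheme the abstract criticizes, under several assumptions that do not come from the source (a linear least-squares model, a uniform quantizer with fixed, evenly spaced levels, and a straight-through gradient estimator). The names `quantize_uniform` and `ste_step` are hypothetical.

```python
import numpy as np

def quantize_uniform(w, num_bits=2):
    """Snap each weight to the nearest of 2**num_bits evenly spaced levels.

    The levels span [-max|w|, max|w|]; this fixed uniform grid is an
    illustrative assumption, not the learnable levels proposed in the paper.
    """
    bound = np.abs(w).max()
    levels = np.linspace(-bound, bound, 2 ** num_bits)
    nearest = np.abs(w[:, None] - levels[None, :]).argmin(axis=1)
    return levels[nearest]

def ste_step(w_fp, x, y, lr=0.01, num_bits=2):
    """One update of the conventional scheme on a toy least-squares problem."""
    w_q = quantize_uniform(w_fp, num_bits)   # forward pass uses quantized weights
    err = x @ w_q - y
    grad_wq = x.T @ err / len(y)             # gradient of 0.5 * mean(err**2) w.r.t. w_q
    # Straight-through estimator (assumed here): treat d(w_q)/d(w_fp) as identity,
    # so the gradient computed at w_q is applied to the full-precision weights.
    return w_fp - lr * grad_wq

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 8))
y = x @ rng.normal(size=8)
w_fp = rng.normal(size=8)

w_fp_new = ste_step(w_fp, x, y)
q_before = quantize_uniform(w_fp)
q_after = quantize_uniform(w_fp_new)
```

In this sketch the optimizer only ever modifies `w_fp`. The quantized weights `q_after` change solely as a side effect of re-quantizing the updated full-precision weights, which is the gap the abstract says the direct-update method is meant to close.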
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: Depthwise Convolution · Residual Connection · Batch Normalization · Bottleneck Residual Block · Pointwise Convolution · Depthwise Separable Convolution · Inverted Residual Block · Average Pooling · Global Average Pooling
