Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)

Jungwook Choi; Pierce I-Jen Chuang; Zhuo Wang; Swagath Venkataramani; Vijayalakshmi Srinivasan; Kailash Gopalakrishnan

arXiv:1807.06964 · cs.CV · July 19, 2018 · 38 cites

TL;DR

This paper introduces novel techniques for 2-bit quantization of neural networks, achieving accuracy comparable to full-precision models by optimizing weight quantization and activation quantization separately.

Contribution

It presents PACT and SAWB, two methods for activation and weight quantization, respectively, that together enable high-accuracy 2-bit QNNs without exhaustive search.

Findings

01 Achieves state-of-the-art accuracy for 2-bit QNNs.
02 PACT optimizes activation clipping during training.
03 SAWB minimizes quantization error based on weight statistics.
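
The quantization error named in the last finding can be made concrete with a small sketch. It assumes a symmetric uniform quantizer with 2**bits levels on [-alpha, alpha] and a mean-squared-error measure; the function names `quantize_weights` and `quantization_error` are hypothetical. It does not reproduce SAWB's rule for deriving the scale from weight statistics, which this page does not give. It only evaluates the error that such a scale is meant to minimize.

```python
import numpy as np

def quantize_weights(w, alpha, bits=2):
    """Map weights to 2**bits evenly spaced levels on [-alpha, alpha] (assumed quantizer)."""
    step = 2.0 * alpha / (2 ** bits - 1)      # spacing between adjacent levels
    w_clipped = np.clip(w, -alpha, alpha)     # saturate weights outside the range
    return np.round((w_clipped + alpha) / step) * step - alpha

def quantization_error(w, alpha, bits=2):
    """Mean squared difference between the weights and their quantized values."""
    return float(np.mean((w - quantize_weights(w, alpha, bits)) ** 2))

# Illustrative weights only; not data from the paper.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=1000)
for alpha in (0.05, 0.10, 0.20):
    print(alpha, quantization_error(w, alpha))
```

Sweeping `alpha` as in the loop is the kind of exhaustive search that, per the abstract, SAWB avoids by choosing the scale from the weight distribution's statistics.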

Abstract

Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost. In order to reduce this cost, several quantization schemes have gained attention recently with some focusing on weight quantization, and others focusing on quantizing activations. This paper proposes novel techniques that target weight and activation quantizations separately resulting in an overall quantized neural network (QNN). The activation quantization technique, PArameterized Clipping acTivation (PACT), uses an activation clipping parameter $α$ that is optimized during training to find the right quantization scale. The weight quantization scheme, statistics-aware weight binning (SAWB), finds the optimal scaling factor that minimizes the quantization error based on the statistical characteristics of the distribution of weights without the need for an exhaustive search.…
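
As a rough illustration of the PACT idea described in the abstract, the sketch below clips activations to [0, alpha] and then quantizes them uniformly. The function names, the NumPy setting, and the straight-through gradient estimate for alpha are assumptions made for illustration; this is not the authors' implementation, and in practice alpha would be a trainable parameter updated by the optimizer along with the weights.

```python
import numpy as np

def pact_forward(x, alpha, bits=2):
    """Clip activations to [0, alpha], then round to 2**bits uniform levels."""
    levels = 2 ** bits - 1            # number of quantization steps in [0, alpha]
    y = np.clip(x, 0.0, alpha)        # parameterized clipping
    scale = levels / alpha
    return np.round(y * scale) / scale

def pact_alpha_grad(x, alpha, upstream):
    """Assumed straight-through estimate of dL/dalpha.

    Rounding is treated as identity, so only activations that were
    clipped at alpha pass gradient to alpha.
    """
    return float(np.sum(upstream * (x >= alpha)))

# Illustrative values only.
x = np.array([-1.0, 0.2, 0.5, 0.9, 3.0])
print(pact_forward(x, alpha=1.5))                    # values on the grid {0, 0.5, 1.0, 1.5}
print(pact_alpha_grad(x, 1.5, np.ones_like(x)))      # one activation is clipped
```

A larger alpha widens the representable range but coarsens the grid, which is the trade-off that learning alpha during training is meant to resolve.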

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning