PACT: Parameterized Clipping Activation for Quantized Neural Networks
Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan

TL;DR
This paper introduces PACT, an activation quantization method that optimizes a clipping parameter during training, enabling neural networks to maintain high accuracy with ultra-low-precision weights and activations and thereby reducing computation cost.
Contribution
The paper presents PACT, a new activation quantization scheme that allows training neural networks with 4-bit weights and activations without significant accuracy loss.
Findings
Quantizing weights and activations to 4 bits maintains accuracy.
PACT outperforms existing quantization schemes.
Hardware implementation benefits include reduced area and improved inference speed.
Abstract
Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost. To address this cost, a number of quantization schemes have been proposed - but most of these techniques focused on quantizing weights, which are relatively smaller in size compared to activations. This paper proposes a novel quantization scheme for activations during training - that enables neural networks to work well with ultra low precision weights and activations without any significant accuracy degradation. This technique, PArameterized Clipping acTivation (PACT), uses an activation clipping parameter that is optimized during training to find the right quantization scale. PACT allows quantizing activations to arbitrary bit precisions, while achieving much better accuracy relative to published state-of-the-art quantization schemes. We show, for the first time, that…
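The mechanism the abstract describes, a clipping parameter that sets the quantization scale, can be illustrated with a minimal scalar sketch. It assumes a clip-to-[0, alpha] step followed by uniform rounding to 2^k - 1 steps, with a straight-through gradient for alpha. The function names `pact_quantize` and `pact_grad_alpha` are hypothetical and chosen only for illustration; this is not the authors' code.

```python
def pact_quantize(x, alpha, bits):
    """Clip x to [0, alpha], then snap it to a uniform grid of 2**bits levels.

    alpha is the clipping parameter (learned during training in PACT);
    here it is passed in as a plain float for illustration.
    """
    steps = (1 << bits) - 1              # number of quantization intervals
    clipped = min(max(x, 0.0), alpha)    # bounded replacement for an unbounded ReLU
    scale = steps / alpha                # grid spacing is alpha / steps
    return round(clipped * scale) / scale


def pact_grad_alpha(x, alpha):
    """Assumed straight-through gradient of the output with respect to alpha.

    Inputs at or above alpha are clipped to alpha, so the output moves
    one-for-one with alpha; below alpha the output does not depend on it.
    """
    return 1.0 if x >= alpha else 0.0


if __name__ == "__main__":
    for value in (-1.0, 2.5, 10.0):
        print(value, pact_quantize(value, alpha=6.0, bits=4), pact_grad_alpha(value, 6.0))
```

A smaller alpha gives finer resolution over a narrower range, while a larger alpha covers more of the activation distribution at coarser resolution; treating alpha as trainable lets the optimizer pick that trade-off per layer.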
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Advanced Neural Network Applications
