PACT: Parameterized Clipping Activation for Quantized Neural Networks

Jungwook Choi; Zhuo Wang; Swagath Venkataramani; Pierce I-Jen Chuang; Vijayalakshmi Srinivasan; Kailash Gopalakrishnan

arXiv:1805.06085·cs.CV·July 18, 2018·719 cites

TL;DR

This paper introduces PACT, an activation quantization method that optimizes a clipping parameter during training, allowing neural networks to keep high accuracy with ultra-low-precision weights and activations and thereby reducing computation cost.

Contribution

The paper presents PACT, a new activation quantization scheme that allows training neural networks with 4-bit weights and activations without significant accuracy loss.

Findings

01. Quantizing weights and activations to 4 bits maintains accuracy.

02. PACT outperforms existing quantization schemes.

03. Hardware implementation benefits include reduced area and improved inference speed.

Abstract

Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost. To address this cost, a number of quantization schemes have been proposed, but most of these techniques focus on quantizing weights, which are relatively small in size compared to activations. This paper proposes a novel quantization scheme for activations during training that enables neural networks to work well with ultra-low-precision weights and activations without any significant accuracy degradation. This technique, PArameterized Clipping acTivation (PACT), uses an activation clipping parameter $\alpha$ that is optimized during training to find the right quantization scale. PACT allows quantizing activations to arbitrary bit precisions, while achieving much better accuracy relative to published state-of-the-art quantization schemes. We show, for the first time, that…
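
The clipping idea in the abstract can be illustrated with a minimal sketch. It is not taken from this page: it assumes the activation is clipped to the range [0, α] and then rounded onto 2^k - 1 uniform steps, and that the gradient with respect to α is approximated with a straight-through estimate. The function names `pact_quantize` and `pact_alpha_grad` are hypothetical, and plain Python scalars stand in for tensors.

```python
def pact_quantize(x, alpha, bits):
    """Clip x to [0, alpha], then round it onto 2**bits - 1 uniform steps.

    Assumed form, for illustration only: alpha is the learned clipping
    level and bits is the activation precision k.
    """
    if alpha <= 0 or bits < 1:
        raise ValueError("alpha must be positive and bits must be >= 1")
    levels = 2 ** bits - 1                 # number of quantization steps
    clipped = min(max(x, 0.0), alpha)      # bounded, ReLU-like clipping
    step = alpha / levels                  # quantization scale set by alpha
    return round(clipped / step) * step


def pact_alpha_grad(x, alpha):
    """Straight-through estimate of d(output)/d(alpha).

    Assumption: only inputs that saturate at alpha contribute gradient,
    so the estimate is 1 when x >= alpha and 0 otherwise.
    """
    return 1.0 if x >= alpha else 0.0


if __name__ == "__main__":
    activations = [-1.0, 0.3, 2.0, 7.5]
    alpha, bits = 6.0, 2
    for a in activations:
        print(a, pact_quantize(a, alpha, bits), pact_alpha_grad(a, alpha))
```

In this sketch a larger α widens the representable range but coarsens the step size for a fixed bit width, which is the trade-off that learning α during training is meant to resolve.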

Taxonomy

Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Advanced Neural Network Applications