Learned Step Size Quantization

Steven K. Esser; Jeffrey L. McKinstry; Deepika Bablani; Rathinakumar Appuswamy; Dharmendra S. Modha

arXiv:1902.08153 · cs.LG · May 8, 2020 · 296 cites

TL;DR

Learned Step Size Quantization (LSQ) is a training method that learns the quantizer step sizes of low-precision deep networks, reaching the highest ImageNet accuracy reported at the time of publication for models with 2- to 4-bit weights and activations.

Contribution

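The paper introduces a way to estimate and scale the task-loss gradient with respect to each weight and activation quantizer's step size, so that the step size can be learned during training. This enables high-accuracy training of low-precision networks with minimal code changes.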

Findings

1. Achieves the highest accuracy to date on ImageNet for models with weights and activations quantized to 2, 3, or 4 bits.
2. Trains 3-bit models that reach full-precision baseline accuracy.
3. Requires only a simple modification to existing training procedures.

Abstract

Deep networks run with low precision operations at inference time offer power and space advantages over high precision alternatives, but need to overcome the challenge of maintaining high accuracy as precision decreases. Here, we present a method for training such networks, Learned Step Size Quantization, that achieves the highest accuracy to date on the ImageNet dataset when using models, from a variety of architectures, with weights and activations quantized to 2-, 3- or 4-bits of precision, and that can train 3-bit models that reach full precision baseline accuracy. Our approach builds upon existing methods for learning weights in quantized networks by improving how the quantizer itself is configured. Specifically, we introduce a novel means to estimate and scale the task loss gradient at each weight and activation layer's quantizer step size, such that it can be learned in…
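The abstract is cut off before it gives any formulas, so the following is only a minimal scalar sketch of how a learned-step-size quantizer is commonly written, not the authors' code. It assumes a clip-then-round quantizer, a straight-through treatment of rounding in the backward pass, and a gradient scale of 1/sqrt(N * Q_P). The function names, the symbols `q_n` and `q_p`, and the round-half-up helper are illustrative choices and should be checked against the full paper.

```python
import math


def round_nearest(x):
    # Round half up; an assumption, since Python's built-in round() rounds ties to even.
    return math.floor(x + 0.5)


def quant_bounds(bits, signed):
    # Number of negative (q_n) and positive (q_p) integer levels for a given bit width.
    if signed:
        return 2 ** (bits - 1), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def lsq_quantize(v, s, q_n, q_p):
    # Forward pass: divide by step size s, clip to [-q_n, q_p], round, rescale by s.
    v_bar = round_nearest(min(max(v / s, -q_n), q_p))
    return v_bar * s


def lsq_step_grad(v, s, q_n, q_p):
    # Estimated d(v_hat)/d(s), with rounding treated as identity in the backward pass.
    r = v / s
    if r <= -q_n:
        return float(-q_n)   # clipped at the negative end
    if r >= q_p:
        return float(q_p)    # clipped at the positive end
    return round_nearest(r) - r


def grad_scale(num_elements, q_p):
    # Assumed scaling applied to the step-size gradient: 1 / sqrt(N * q_p).
    return 1.0 / math.sqrt(num_elements * q_p)


if __name__ == "__main__":
    q_n, q_p = quant_bounds(bits=2, signed=True)
    for v in (-5.0, -0.3, 0.3, 0.7, 5.0):
        print(v, lsq_quantize(v, 0.5, q_n, q_p), lsq_step_grad(v, 0.5, q_n, q_p))
```

In a real training loop the same logic would operate on tensors inside an autograd framework, with the step size stored as a trainable parameter per layer; the scalar form here is only meant to make the forward and backward rules easy to inspect.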

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning