Learned Step Size Quantization
Steven K. Esser, Jeffrey L. McKinstry, Deepika Bablani, Rathinakumar Appuswamy, Dharmendra S. Modha

TL;DR
Learned Step Size Quantization is a training method that optimizes quantizer step sizes for low-precision deep networks, achieving state-of-the-art accuracy on ImageNet with 2-4 bit weights and activations.
Contribution
The paper introduces a novel gradient estimation technique for quantizer step sizes, enabling high-accuracy low-precision network training with minimal code changes.
Findings
Achieves the highest accuracy reported to date on ImageNet for models with weights and activations quantized to 2, 3, or 4 bits, across a variety of architectures.
Trains 3-bit models that reach full-precision baseline accuracy.
Provides a simple modification to existing training procedures.
Abstract
Deep networks run with low precision operations at inference time offer power and space advantages over high precision alternatives, but need to overcome the challenge of maintaining high accuracy as precision decreases. Here, we present a method for training such networks, Learned Step Size Quantization, that achieves the highest accuracy to date on the ImageNet dataset when using models, from a variety of architectures, with weights and activations quantized to 2-, 3- or 4-bits of precision, and that can train 3-bit models that reach full precision baseline accuracy. Our approach builds upon existing methods for learning weights in quantized networks by improving how the quantizer itself is configured. Specifically, we introduce a novel means to estimate and scale the task loss gradient at each weight and activation layer's quantizer step size, such that it can be learned in…
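The abstract excerpt above is cut off before the method details, so the following is only a minimal sketch of a learned step size quantizer for a single scalar value, based on our reading of the full paper rather than on the text shown here. The function names (`quantizer_bounds`, `lsq_quantize`, `step_size_grad`, `grad_scale`) are hypothetical and chosen for illustration; the clip-round-rescale forward pass, the straight-through estimate of the step size gradient, and the 1/sqrt(N * Q_P) gradient scale are our understanding of the approach, not a reference implementation.

```python
import math


def quantizer_bounds(bits, signed=True):
    """Return (Q_N, Q_P): the count of negative and positive integer levels."""
    if signed:
        return 2 ** (bits - 1), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def lsq_quantize(v, s, q_n, q_p):
    """Forward pass: divide by step size s, clip, round, multiply back by s."""
    clipped = min(max(v / s, -q_n), q_p)
    # Note: Python's round() sends exact .5 ties to the nearest even integer.
    return round(clipped) * s


def step_size_grad(v, s, q_n, q_p):
    """Estimate d(quantized v)/d(s), treating round() as identity in the backward pass."""
    x = v / s
    if x <= -q_n:
        return -q_n          # value clipped at the negative limit
    if x >= q_p:
        return q_p           # value clipped at the positive limit
    return round(x) - x      # inside the range: rounding error in units of s


def grad_scale(num_elements, q_p):
    """Assumed scale applied to the step size gradient: 1 / sqrt(N * Q_P)."""
    return 1.0 / math.sqrt(num_elements * q_p)


if __name__ == "__main__":
    q_n, q_p = quantizer_bounds(bits=2, signed=True)
    s = 0.25
    for v in (-1.0, -0.3, 0.1, 0.3):
        print(v, lsq_quantize(v, s, q_n, q_p), step_size_grad(v, s, q_n, q_p))
```

In a real training setup the step size `s` would be a learnable parameter per layer, updated by the optimizer using the gradient above multiplied by `grad_scale`; that wiring depends on the framework and is left out here.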
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
