Discovering Low-Precision Networks Close to Full-Precision Networks for Efficient Embedded Inference
Jeffrey L. McKinstry, Steven K. Esser, Rathinakumar Appuswamy, Deepika Bablani, John V. Arthur, Izzet B. Yildiz, Dharmendra S. Modha

TL;DR
This paper demonstrates that low-precision neural networks can compete with full-precision networks on ImageNet: 8-bit models exceed the full-precision baselines and 4-bit models match them, enabling more energy-efficient embedded inference.
Contribution
It shows that, starting from pretrained models, simple fine-tuning yields 8-bit networks that exceed full-precision accuracy and 4-bit networks that match it.
Findings
4-bit networks match full-precision accuracy on ImageNet
Low-precision weights are very similar to full-precision weights
Pretrained models and calibration enable discovering low-precision networks
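The similarity finding above can be illustrated with a minimal sketch. It assumes a plain symmetric uniform quantizer whose step size is set from the largest absolute weight, applied to synthetic Gaussian weights; the function names `quantize_symmetric` and `cosine_similarity` are hypothetical, and the paper's own calibration and fine-tuning procedure is not reproduced here.

```python
import math
import random


def quantize_symmetric(weights, bits):
    """Uniformly quantize a list of floats to signed `bits`-bit levels.

    Assumption: the step size comes from the largest absolute weight.
    This is an illustrative quantizer, not the paper's method.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    step = max_abs / qmax
    # Round to the nearest level, clip to the representable range, rescale.
    return [max(-qmax, min(qmax, round(w / step))) * step for w in weights]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Synthetic stand-in for a layer's weights (assumption: Gaussian, std 0.05).
rng = random.Random(0)
full_precision = [rng.gauss(0.0, 0.05) for _ in range(10_000)]

for bits in (8, 4):
    quantized = quantize_symmetric(full_precision, bits)
    sim = cosine_similarity(full_precision, quantized)
    print(f"{bits}-bit: cosine similarity = {sim:.4f}")
```

Even without fine-tuning, rounding to a coarse grid leaves the weight vector pointing in nearly the same direction, which gives some intuition for why a low-precision solution can sit close to the full-precision one.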
Abstract
To realize the promise of ubiquitous embedded deep network inference, it is essential to seek limits of energy and area efficiency. To this end, low-precision networks offer tremendous promise because both energy and area scale down quadratically with the reduction in precision. Here we demonstrate ResNet-18, -34, -50, -152, Inception-v3, Densenet-161, and VGG-16bn networks on the ImageNet classification benchmark that, at 8-bit precision, exceed the accuracy of the full-precision baseline networks after one epoch of finetuning, thereby leveraging the availability of pretrained models. We also demonstrate ResNet-18, -34, -50, -152, Densenet-161, and VGG-16bn 4-bit models that match the accuracy of the full-precision baseline networks -- the highest scores to date. Surprisingly, the weights of the low-precision networks are very close (in cosine similarity) to the weights of the…
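As a rough worked example of the quadratic scaling claim in the abstract, suppose energy and area cost $C$ scale exactly as the square of the bit width $b$, and that the full-precision baseline uses 32 bits. Both are simplifying assumptions made here for illustration, not figures reported by the paper.

```latex
C(b) \propto b^{2}
\quad\Longrightarrow\quad
\frac{C(32)}{C(8)} = \left(\frac{32}{8}\right)^{2} = 16,
\qquad
\frac{C(32)}{C(4)} = \left(\frac{32}{4}\right)^{2} = 64 .
```

Under these idealized assumptions, moving from 8-bit to 4-bit precision alone would give a further factor of $(8/4)^{2} = 4$.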
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: RMSProp · Convolution · Average Pooling · Auxiliary Classifier · 1x1 Convolution · Inception-v3 Module · Max Pooling · Softmax · Dropout · Dense Connections
