Discovering Low-Precision Networks Close to Full-Precision Networks for Efficient Embedded Inference

Jeffrey L. McKinstry; Steven K. Esser; Rathinakumar Appuswamy; Deepika Bablani; John V. Arthur; Izzet B. Yildiz; Dharmendra S. Modha

arXiv:1809.04191 · cs.CV · February 26, 2019 · 46 cites

Open Access

TL;DR

This paper demonstrates that low-precision neural networks can reach full-precision accuracy on ImageNet: 8-bit models exceed their full-precision baselines after one epoch of fine-tuning, and 4-bit models match them, enabling more energy-efficient embedded inference.

Contribution

It shows that 8-bit networks can exceed, and 4-bit networks can match, the accuracy of full-precision baselines by applying simple fine-tuning techniques to pretrained models.

Findings

01 4-bit networks match full-precision accuracy on ImageNet

02 Low-precision weights are very similar to full-precision weights

03 Pretrained models and calibration enable discovering low-precision networks

Abstract

To realize the promise of ubiquitous embedded deep network inference, it is essential to seek limits of energy and area efficiency. To this end, low-precision networks offer tremendous promise because both energy and area scale down quadratically with the reduction in precision. Here we demonstrate ResNet-18, -34, -50, -152, Inception-v3, Densenet-161, and VGG-16bn networks on the ImageNet classification benchmark that, at 8-bit precision exceed the accuracy of the full-precision baseline networks after one epoch of finetuning, thereby leveraging the availability of pretrained models. We also demonstrate ResNet-18, -34, -50, -152, Densenet-161, and VGG-16bn 4-bit models that match the accuracy of the full-precision baseline networks -- the highest scores to date. Surprisingly, the weights of the low-precision networks are very close (in cosine similarity) to the weights of the…
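The abstract's remark that low-precision weights stay close to full-precision weights in cosine similarity can be illustrated with a small sketch. The code below is not the paper's method: the quantizer and calibration the authors use are not described in this excerpt, so a generic max-abs uniform symmetric quantizer (`quantize_symmetric`, a hypothetical helper) stands in, and randomly drawn Gaussian values stand in for real network weights.

```python
import math
import random


def quantize_symmetric(weights, bits):
    """Snap weights onto a uniform symmetric grid (assumed scheme, not the paper's)."""
    qmax = 2 ** (bits - 1) - 1  # 7 levels per side for 4-bit, 127 for 8-bit
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)  # nothing to scale
    step = max_abs / qmax  # grid spacing chosen from the largest magnitude
    # Round to the nearest grid index, clamp to the representable range,
    # then map back to real values so the result is comparable to the input.
    return [max(-qmax, min(qmax, round(w / step))) * step for w in weights]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Stand-in "weights": zero-mean Gaussian samples, not taken from any real model.
rng = random.Random(0)
full_precision = [rng.gauss(0.0, 0.05) for _ in range(10_000)]

for bits in (8, 4):
    quantized = quantize_symmetric(full_precision, bits)
    print(bits, cosine_similarity(full_precision, quantized))
```

Even this naive quantizer, applied without any fine-tuning, tends to keep the quantized vector nearly parallel to the original; the paper's claim concerns trained low-precision networks, which this toy example does not reproduce.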

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: RMSProp · Convolution · Average Pooling · Auxiliary Classifier · 1x1 Convolution · Inception-v3 Module · Max Pooling · Softmax · Dropout · Dense Connections