Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Benoit Jacob; Skirmantas Kligys; Bo Chen; Menglong Zhu; Matthew Tang; Andrew Howard; Hartwig Adam; Dmitry Kalenichenko

arXiv:1712.05877·cs.LG·December 19, 2017

TL;DR

This paper introduces an integer-only quantization scheme and a co-designed training procedure that enable efficient, accurate deep learning inference on mobile devices with integer-only hardware, improving the tradeoff between accuracy and on-device latency.

Contribution

It presents a novel quantization scheme and training procedure that together enable end-to-end integer-only inference with minimal accuracy loss.

Findings

01

Significant improvements in the tradeoff between accuracy and on-device latency, even on MobileNets, a model family already known for run-time efficiency.

02

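Inference uses integer-only arithmetic, which can be implemented more efficiently than floating-point inference on commonly available integer-only hardware.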

03

Demonstrated on ImageNet classification and COCO detection, running on popular CPUs.

Abstract

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
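
The abstract does not spell out how real values map to integers, so the following is only a minimal sketch of the general idea, assuming an affine mapping r = S(q - Z) between a real value r and an 8-bit integer q, with a real-valued scale S and an integer zero-point Z. The function names (choose_params, quantize, dequantize, int_dot) and the sample vectors are hypothetical and chosen for illustration; they are not taken from the paper or from any library.

```python
# Illustrative sketch only: assumes the affine mapping r = S * (q - Z).
# All names here are hypothetical, not an API from the paper or a library.

def choose_params(rmin, rmax, qmin=0, qmax=255):
    """Pick a scale S and zero-point Z so that real 0.0 maps exactly to an integer."""
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)  # the range must contain 0.0
    scale = (rmax - rmin) / (qmax - qmin)
    if scale == 0.0:  # degenerate all-zero input
        scale = 1.0
    zero_point = int(round(qmin - rmin / scale))
    return scale, max(qmin, min(qmax, zero_point))

def quantize(r, scale, zero_point, qmin=0, qmax=255):
    """Map a real value to the nearest representable integer, clamped to [qmin, qmax]."""
    q = int(round(r / scale)) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Recover the real value that the integer q represents."""
    return scale * (q - zero_point)

def quantize_all(values):
    """Quantize a list of reals with parameters derived from its own min and max."""
    scale, zp = choose_params(min(values), max(values))
    return [quantize(v, scale, zp) for v in values], scale, zp

def int_dot(qa, za, qb, zb):
    """Accumulate sum((qa - za) * (qb - zb)) using integer arithmetic only."""
    return sum((x - za) * (y - zb) for x, y in zip(qa, qb))

a = [-1.0, 0.0, 0.5, 1.0]
b = [0.25, -0.5, 0.75, 1.0]
qa, sa, za = quantize_all(a)
qb, sb, zb = quantize_all(b)

acc = int_dot(qa, za, qb, zb)  # integer accumulator
approx = sa * sb * acc         # float rescale, used here only to check the result
exact = sum(x * y for x, y in zip(a, b))
print(exact, approx)
```

The inner loop touches only integers; the single multiplication by sa * sb at the end is done in floating point here purely to compare against the exact dot product. A fully integer-only pipeline would also have to express that rescaling in integer arithmetic, which this sketch does not attempt.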
