Mixed-Precision Neural Network Quantization via Learned Layer-wise Importance

Chen Tang, Kai Ouyang, Zhi Wang, Yifei Zhu, Yaowei Wang, Wen Ji, and Wenwu Zhu

arXiv:2203.08368 · cs.LG · March 7, 2023 · 5 citations

TL;DR

This paper introduces a fast, importance-based method for mixed-precision neural network quantization that significantly reduces search time and achieves state-of-the-art accuracy on ImageNet.

Contribution

It proposes a joint training scheme to learn layer importance indicators and formulates the quantization search as a single ILP problem, greatly improving efficiency.
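To make the search formulation concrete, here is a minimal sketch of choosing one bit-width per layer under a resource budget. It is not the authors' implementation: the function name `select_bitwidths`, the `importance` and `cost` tables, and the assumption that a lower summed indicator is preferable are all hypothetical, and exhaustive enumeration stands in for an ILP solver so that the example needs only the standard library.

```python
from itertools import product


def select_bitwidths(importance, cost, budget):
    """Choose one bit-width per layer, minimizing summed importance under a budget.

    importance[l][b]: hypothetical learned indicator of layer l at bit-width b
                      (assumption: lower total is better).
    cost[l][b]:       hypothetical resource cost of layer l at bit-width b.
    budget:           maximum allowed total cost.
    Returns a list of bit-widths, or None if no assignment fits the budget.
    """
    layers = range(len(importance))
    # Candidate bit-widths for each layer, taken from the table keys.
    choices = [sorted(importance[l]) for l in layers]

    best_score = None
    best_assignment = None
    # Brute force over all assignments; an ILP solver would replace this
    # loop for networks with many layers.
    for assignment in product(*choices):
        total_cost = sum(cost[l][b] for l, b in zip(layers, assignment))
        if total_cost > budget:
            continue  # violates the resource constraint
        score = sum(importance[l][b] for l, b in zip(layers, assignment))
        if best_score is None or score < best_score:
            best_score = score
            best_assignment = assignment

    return None if best_assignment is None else list(best_assignment)


# Made-up numbers for a two-layer network, for illustration only.
importance = [{2: 0.9, 4: 0.4, 8: 0.1}, {2: 0.3, 4: 0.2, 8: 0.15}]
cost = [{2: 2, 4: 4, 8: 8}, {2: 2, 4: 4, 8: 8}]
print(select_bitwidths(importance, cost, budget=10))
```

With these made-up tables, the first layer loses far more from low precision than the second, so the budget is spent on the first layer. Because the indicators are fixed before the search, the problem is solved once rather than iteratively, which is the efficiency argument the summary makes.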

Findings

01. Achieves state-of-the-art accuracy on ImageNet under various constraints.

02. Reduces quantization search time from hours to milliseconds.

03. Effectively guides mixed-precision quantization using learned importance indicators.

Abstract

The exponentially large discrete search space in mixed-precision quantization (MPQ) makes it hard to determine the optimal bit-width for each layer. Previous works usually resort to iterative search methods on the training set, which consume hundreds or even thousands of GPU-hours. In this study, we reveal that some unique learnable parameters in quantization, namely the scale factors in the quantizer, can serve as importance indicators of a layer, reflecting the contribution of that layer to the final accuracy at certain bit-widths. These importance indicators naturally perceive the numerical transformation during quantization-aware training, which can precisely provide quantization sensitivity metrics of layers. However, a deep network always contains hundreds of such indicators, and training them one by one would lead to an excessive time cost. To overcome this issue, we propose a…
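For readers unfamiliar with the scale factor the abstract refers to, the following is a minimal sketch of a signed, symmetric uniform quantizer, a common formulation that is assumed here rather than taken from the paper. The function name `quantize` is hypothetical, and `scale` is a plain float in this sketch, whereas in quantization-aware training it would be a learnable parameter.

```python
def quantize(x, scale, bits):
    """Quantize a real value x to a signed `bits`-bit grid with step `scale`.

    Assumed formulation: round x / scale to the nearest integer, clamp it to
    the representable range, then map back to a real value.
    """
    qmin = -(2 ** (bits - 1))      # e.g. -8 for 4 bits
    qmax = 2 ** (bits - 1) - 1     # e.g.  7 for 4 bits
    q = round(x / scale)           # nearest integer level
    q = max(qmin, min(qmax, q))    # clamp to the integer range
    return q * scale               # de-quantized value


print(quantize(0.7, 0.25, 4))    # rounds to the nearest grid point
print(quantize(10.0, 0.25, 4))   # saturates at the largest level
```

The scale fixes both the grid spacing and the clipping range for a given bit-width, which gives some intuition for why a learned scale could carry information about how a layer responds to quantization.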

Code & Models

Repositories

1hunters/LIMPQ (PyTorch, official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Advanced Vision and Imaging