Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection

Tianshu Chu; Qin Luo; Jie Yang; Xiaolin Huang

arXiv:1912.12656 · cs.CV · January 1, 2020 · 5 cites

Open Access

TL;DR

This paper introduces a mixed-precision quantization method in which bitwidth decreases progressively from the early layers to the later ones, improving both accuracy and compression in neural networks for image classification and object detection.

Contribution

It proposes a novel layer-wise bitwidth reduction strategy based on feature distribution analysis, improving the accuracy-compression trade-off in quantized neural networks.

Findings

01. Achieves over 30% memory reduction compared to homogeneous quantization.

02. Higher-precision bottom layers improve 1-bit network performance.

03. Lower-precision posterior layers aid regularization.
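
To make the memory comparison in the first finding concrete, the following is a minimal sketch of how weight storage could be tallied for a homogeneous bitwidth versus a decreasing per-layer schedule. The layer sizes, the schedule `[8, 4, 2, 1]`, the 4-bit baseline, and the helper names `model_bits` and `memory_reduction` are all assumptions made up for illustration. They are not taken from the paper, and the number this prints is not the paper's reported figure.

```python
def model_bits(param_counts, bitwidths):
    """Total weight storage in bits, given per-layer parameter counts and bitwidths."""
    if len(param_counts) != len(bitwidths):
        raise ValueError("need one bitwidth per layer")
    return sum(n * b for n, b in zip(param_counts, bitwidths))


def memory_reduction(param_counts, mixed_bits, uniform_bits):
    """Fractional saving of a mixed-precision schedule relative to a homogeneous one."""
    mixed = model_bits(param_counts, mixed_bits)
    uniform = model_bits(param_counts, [uniform_bits] * len(param_counts))
    return 1.0 - mixed / uniform


# Hypothetical 4-layer network in which later layers hold most of the weights.
params = [1_000, 4_000, 16_000, 64_000]
# Progressively decreasing bitwidths, invented for illustration only.
schedule = [8, 4, 2, 1]

saving = memory_reduction(params, schedule, uniform_bits=4)
print(f"saving vs. homogeneous 4-bit: {saving:.1%}")
```

In this toy setup the last layer holds most of the weights, so lowering its bitwidth outweighs the extra bits spent on the first layer. Whether the same holds for a real architecture depends on how its parameters are distributed across layers.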

Abstract

Efficient model inference is an important and practical issue in the deployment of deep neural networks on resource-constrained platforms. Network quantization addresses this problem effectively by leveraging low-bit representations and arithmetic that can be carried out on dedicated embedded systems. In previous work, the parameter bitwidth is set homogeneously, and there is a trade-off between superior performance and aggressive compression. In fact, the stacked network layers, which are generally regarded as hierarchical feature extractors, contribute differently to the overall performance. For a well-trained neural network, the feature distributions of different categories separate gradually as the input propagates forward through the network. Hence the capability required of the subsequent feature extractors is reduced. This indicates that the neurons in posterior layers could be assigned with…
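
The idea of giving later layers fewer bits can be sketched with a generic uniform quantizer and a per-layer schedule. This is an illustrative sketch only: the abstract above is truncated before it describes how bitwidths are assigned, so the linear schedule in `decreasing_bitwidths`, the quantizer `quantize_uniform`, and the assumption that values lie in [0, 1] are hypothetical stand-ins rather than the authors' method.

```python
def quantize_uniform(x, bits):
    """Map x (clipped to [0, 1]) to the nearest of 2**bits evenly spaced levels."""
    if bits < 1:
        raise ValueError("bits must be >= 1")
    levels = (1 << bits) - 1           # number of steps between 0 and 1
    x = min(max(x, 0.0), 1.0)          # clip to the representable range
    return round(x * levels) / levels


def decreasing_bitwidths(num_layers, first_bits, last_bits):
    """Hypothetical schedule: interpolate linearly from first_bits down to last_bits."""
    if num_layers < 1 or first_bits < last_bits:
        raise ValueError("need num_layers >= 1 and first_bits >= last_bits")
    if num_layers == 1:
        return [first_bits]
    step = (first_bits - last_bits) / (num_layers - 1)
    return [max(last_bits, round(first_bits - i * step)) for i in range(num_layers)]


# Toy weights, one short list per layer (values invented for illustration).
layers = [[0.12, 0.37, 0.91], [0.05, 0.66, 0.80], [0.29, 0.48, 0.73], [0.10, 0.70, 0.95]]
bits_per_layer = decreasing_bitwidths(len(layers), first_bits=8, last_bits=1)

for weights, bits in zip(layers, bits_per_layer):
    quantized = [quantize_uniform(w, bits) for w in weights]
    worst = max(abs(w - q) for w, q in zip(weights, quantized))
    print(f"{bits}-bit layer: max rounding error {worst:.4f}")
```

The rounding error of this quantizer is bounded by half a quantization step, so the bound loosens as the bitwidth drops. The abstract's argument is that later layers can tolerate this because class features are already well separated by the time they reach them.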

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM