Data-free mixed-precision quantization using novel sensitivity metric
Donghyun Lee, Minkyoung Cho, Seungwon Lee, Joonho Song, Changkyu Choi

TL;DR
This paper introduces a new sensitivity metric for post-training mixed-precision quantization that improves accuracy and efficiency without requiring access to original training data.
Contribution
It proposes a novel sensitivity metric considering quantization error effects and layer interactions, along with data generation methods independent of specific network operations.
Findings
The new sensitivity metric better predicts quantization sensitivity.
Generated data are more effective for mixed-precision quantization.
The approach enhances neural network compression without original data.
Abstract
Post-training quantization is a representative technique for compressing neural networks, making them smaller and more efficient for deployment on edge devices. However, an inaccessible user dataset often makes it difficult to ensure the quality of the quantized neural network in practice. In addition, existing approaches may use a single uniform bit-width across the network, resulting in significant accuracy degradation at extremely low bit-widths. When multiple bit-widths are used, the sensitivity metric plays a key role in balancing accuracy and compression. In this paper, we propose a novel sensitivity metric that considers the effect of quantization error on task loss and interaction with other layers. Moreover, we develop labeled data generation methods that are not dependent on a specific operation of the neural network. Our experiments show that the proposed metric better represents…
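To make the idea of sensitivity-driven bit allocation concrete, here is a minimal Python sketch. It is not the paper's metric: as a stand-in it uses the per-layer mean squared quantization error of the weights, which ignores both task loss and interactions between layers (the two effects the proposed metric is meant to capture). The function names (quantize, mse, allocate_bits), the greedy allocation rule, and the parameter-weighted average-bit budget are all assumptions made for illustration.

```python
def quantize(weights, bits):
    """Symmetric uniform quantization to signed `bits`-bit levels, then dequantize."""
    if bits < 2:
        raise ValueError("this sketch assumes at least 2 bits")
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / qmax
    # Round to the nearest integer level, clamp, and map back to real values.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def allocate_bits(layers, candidate_bits, avg_bits_budget):
    """Greedy mixed-precision assignment (illustrative, not the paper's method).

    Every layer starts at the highest candidate bit-width. While the
    parameter-weighted average bit-width exceeds the budget, the layer whose
    next-lower bit-width gives the smallest weight MSE is lowered one step.
    """
    levels = sorted(candidate_bits, reverse=True)
    idx = {name: 0 for name in layers}
    n_params = sum(len(w) for w in layers.values())

    def average_bits():
        return sum(levels[idx[n]] * len(w) for n, w in layers.items()) / n_params

    while average_bits() > avg_bits_budget:
        candidates = []
        for name, w in layers.items():
            if idx[name] + 1 < len(levels):
                err = mse(w, quantize(w, levels[idx[name] + 1]))
                candidates.append((err, name))
        if not candidates:
            break  # every layer is already at the lowest bit-width
        _, chosen = min(candidates)
        idx[chosen] += 1
    return {name: levels[i] for name, i in idx.items()}
```

Calling allocate_bits with a dict that maps layer names to flat weight lists, a list of candidate bit-widths such as [8, 4, 2], and a target average returns one bit-width per layer. A metric like the one the abstract describes would replace the mse call with a score that reflects the change in task loss.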
Taxonomy
Topics: Advanced Neural Network Applications · Image Enhancement Techniques · Advanced Image and Video Retrieval Techniques
