CSMPQ: Class Separability Based Mixed-Precision Quantization
Mingkai Wang, Taisong Jin, Miaohui Zhang, Zhengtao Yu

TL;DR
CSMPQ introduces a class separability metric based on TF-IDF to efficiently determine optimal mixed-precision quantization configurations, improving compression and inference speed without iterative training.
Contribution
The paper proposes a novel class separability measure using TF-IDF for mixed-precision quantization, avoiding iterative search and achieving better compression-performance trade-offs.
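As a rough illustration of the TF-IDF idea, the sketch below computes standard TF-IDF scores on a toy count matrix. How the paper maps feature maps onto "terms" and "documents" is not stated in this summary; treating classes as documents and feature channels as terms, along with the names `tf_idf` and `counts`, is an assumption made purely for illustration.

```python
import math

def tf_idf(counts):
    """counts[d][t] is how often term t occurs in document d.

    Returns a same-shaped list of TF-IDF scores using
    tf = count / total count in the document and idf = log(N / df).
    """
    n_docs = len(counts)
    n_terms = len(counts[0])
    # Document frequency: number of documents in which each term occurs at all.
    df = [sum(1 for doc in counts if doc[t] > 0) for t in range(n_terms)]
    scores = []
    for doc in counts:
        total = sum(doc)
        row = []
        for t, c in enumerate(doc):
            tf = c / total if total else 0.0
            idf = math.log(n_docs / df[t]) if df[t] else 0.0
            row.append(tf * idf)
        scores.append(row)
    return scores

# Hypothetical toy data (an assumption, not from the paper):
# rows are classes, columns are feature channels, and each entry counts
# how often that channel is strongly activated for that class.
counts = [
    [5, 0, 5],
    [0, 5, 5],
]
scores = tf_idf(counts)
```

In this toy case the third channel fires equally for both classes, so its score is zero, while the first two channels each score highly for exactly one class. That is the intuition behind using such a score as a separability signal: features shared by every class carry no discriminative weight.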
Findings
Achieves 73.03% Top-1 accuracy on ResNet-18 with 59G BOPs.
Attains 71.30% Top-1 accuracy on MobileNetV2 with 1.5MB.
Achieves better compression trade-offs than state-of-the-art quantization methods without any iterative search or training.
Abstract
Mixed-precision quantization has received increasing attention for its ability to reduce the computational burden and speed up inference. Existing methods usually focus on the sensitivity of different network layers, which requires a time-consuming search or training process. To address this, a novel mixed-precision quantization method, termed CSMPQ, is proposed. Specifically, the TF-IDF metric that is widely used in natural language processing (NLP) is introduced to measure the class separability of layer-wise feature maps. Furthermore, a linear programming problem is designed to derive the optimal bit configuration for each layer. Without any iterative process, the proposed CSMPQ achieves better compression trade-offs than the state-of-the-art quantization methods. Specifically, CSMPQ achieves 73.03% Top-1 accuracy on ResNet-18 with only 59G BOPs for QAT, and 71.30…
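To make the bit-allocation step concrete, here is a minimal sketch of choosing one bit-width per layer under a BOPs budget. The paper formulates this as a linear programming problem whose exact objective and constraints are not given in this summary; the sketch swaps in a plain exhaustive search over a tiny example instead. The objective (sum of importance times bit-width), the BOPs estimate (MACs times weight bits times activation bits, with both set to the same value), and all names and numbers are assumptions for illustration.

```python
from itertools import product

def best_bit_config(importance, macs, bit_choices, bops_budget):
    """Exhaustively pick one bit-width per layer.

    Maximizes sum(importance[i] * bits[i]) subject to
    sum(macs[i] * bits[i] * bits[i]) <= bops_budget.
    Returns None if no configuration fits the budget.
    """
    best, best_score = None, float("-inf")
    for bits in product(bit_choices, repeat=len(macs)):
        # Assumed BOPs model: MACs * weight bits * activation bits,
        # with weights and activations sharing one bit-width per layer.
        bops = sum(m * b * b for m, b in zip(macs, bits))
        if bops > bops_budget:
            continue
        score = sum(w * b for w, b in zip(importance, bits))
        if score > best_score:
            best, best_score = bits, score
    return best

# Hypothetical per-layer importance scores and MAC counts.
importance = [0.9, 0.2, 0.5]
macs = [100, 400, 200]
config = best_bit_config(importance, macs, bit_choices=(2, 4, 8), bops_budget=12000)
```

With these toy numbers the search gives the most important layer 8 bits and pushes the cheap-to-degrade but MAC-heavy second layer down to 2 bits. Exhaustive search scales exponentially with the number of layers, which is why a linear programming formulation is attractive for real networks.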
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Brain Tumor Detection and Classification · Advanced Chemical Sensor Technologies
Methods: Pointwise Convolution · Batch Normalization · Depthwise Convolution · 1x1 Convolution · Depthwise Separable Convolution · Convolution · Inverted Residual Block · Average Pooling
