PTQ-SL: Exploring the Sub-layerwise Post-training Quantization
Zhihang Yuan, Yiqi Chen, Chenhao Xue, Chenguang Zhang, Qiankun Wang, Guangyu Sun

TL;DR
This paper introduces PTQ-SL, a post-training quantization method that operates at sub-layerwise granularity, and shows that, combined with channel reordering, sub-layerwise quantization can outperform traditional channelwise quantization.
Contribution
The paper explores a new sub-layerwise quantization granularity and proposes channel reordering to improve neural network quantization accuracy.
Findings
Sub-layerwise quantization accuracy correlates with granularity.
Channel reordering improves sub-layerwise quantization performance.
Sub-layerwise quantization can outperform channelwise methods.
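To illustrate why channel reordering can help, here is a minimal NumPy sketch. It assumes a 2D weight matrix whose rows are output channels, symmetric max-abs scaling, and one scale per group of consecutive rows. The reordering rule (sorting channels by their largest absolute weight) is a hypothetical heuristic chosen for illustration, not the paper's reordering procedure, and the function names are made up for this example.

```python
import numpy as np

def quantize_row_groups(w, group, n_bits=4):
    """Fake-quantize w with one symmetric scale per `group` consecutive rows."""
    qmax = 2 ** (n_bits - 1) - 1
    g = w.reshape(w.shape[0] // group, group, w.shape[1])
    max_abs = np.abs(g).max(axis=(1, 2), keepdims=True)
    scale = np.where(max_abs > 0, max_abs / qmax, 1.0)  # avoid division by zero
    q = np.clip(np.round(g / scale), -qmax, qmax)
    return (q * scale).reshape(w.shape)

def reorder_then_quantize(w, group, n_bits=4):
    """Assumed heuristic: sort rows by max |w| so similar ranges share a scale."""
    perm = np.argsort(np.abs(w).max(axis=1), kind="stable")
    wq_sorted = quantize_row_groups(w[perm], group, n_bits)
    wq = np.empty_like(wq_sorted)
    wq[perm] = wq_sorted  # undo the permutation to restore the original order
    return wq, perm

# Toy weights: rows alternate between a small range (0.01) and a large one (1.0).
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16))
w /= np.abs(w).max(axis=1, keepdims=True)  # every row now has max |w| = 1
w[0::2] *= 0.01                            # even rows get the small range

mse_plain = np.mean((w - quantize_row_groups(w, 2)) ** 2)
wq, perm = reorder_then_quantize(w, 2)
mse_reordered = np.mean((w - wq) ** 2)
print(mse_plain, mse_reordered)
```

In this toy setup every unsorted group mixes a small-range row with a large-range row, so the small rows are rounded with a scale far too coarse for them. After sorting, rows with similar ranges share a scale and the reconstruction error drops.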
Abstract
Network quantization is a powerful technique for compressing convolutional neural networks. The quantization granularity determines how scaling factors are shared among the weights, which affects the performance of network quantization. Most existing approaches to quantizing convolutional layers share the scaling factors per layer (layerwise) or per channel (channelwise), and both granularities have been widely used in various applications. Other quantization granularities, however, are rarely explored. In this paper, we explore the sub-layerwise granularity, which shares a scaling factor across multiple input and output channels. We propose an efficient post-training quantization method in sub-layerwise granularity (PTQ-SL). Then we systematically experiment on various granularities and observe that the prediction accuracy of the quantized neural network has a strong…
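The granularities described in the abstract can be sketched as one tiling scheme. The example below is a minimal NumPy sketch, not the PTQ-SL implementation: it assumes a 2D weight matrix (rows are output channels, columns are input channels), symmetric max-abs scaling, and a hypothetical function name and block-size parameters. A tile covering the whole matrix corresponds to layerwise sharing, a tile of one full row to channelwise sharing, and anything in between to a sub-layerwise granularity.

```python
import numpy as np

def quantize_sublayerwise(w, block_out, block_in, n_bits=8):
    """Fake-quantize w with one scale per (block_out x block_in) tile.

    Returns the dequantized weights and the per-tile scales.
    """
    out_ch, in_ch = w.shape
    assert out_ch % block_out == 0 and in_ch % block_in == 0
    qmax = 2 ** (n_bits - 1) - 1
    # View the matrix as tiles: (n_out_blocks, block_out, n_in_blocks, block_in).
    tiles = w.reshape(out_ch // block_out, block_out,
                      in_ch // block_in, block_in)
    # One symmetric scale per tile, taken from the tile's largest |weight|.
    max_abs = np.abs(tiles).max(axis=(1, 3), keepdims=True)
    scale = np.where(max_abs > 0, max_abs / qmax, 1.0)
    q = np.clip(np.round(tiles / scale), -qmax, qmax)
    return (q * scale).reshape(out_ch, in_ch), scale.squeeze(axis=(1, 3))

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16))

_, s_layer = quantize_sublayerwise(w, 8, 16)    # layerwise: a single scale
_, s_channel = quantize_sublayerwise(w, 1, 16)  # channelwise: one per output channel
_, s_sub = quantize_sublayerwise(w, 4, 4)       # sub-layerwise: one per 4x4 tile
print(s_layer.shape, s_channel.shape, s_sub.shape)
```

The number of stored scales equals the number of tiles, so the block sizes trade scale-storage overhead against how closely each scale fits the weights it covers.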
Taxonomy
Topics: Advanced Neural Network Applications · Image and Signal Denoising Methods · Generative Adversarial Networks and Image Synthesis
