Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss
Jung Hyun Lee, Jihun Yun, Sung Ju Hwang, Eunho Yang

TL;DR
This paper introduces Cluster-Promoting Quantization (CPQ), a novel neural network quantization method that optimizes quantization grids and encourages weights to cluster around them, reducing quantization errors and improving performance.
Contribution
The work proposes a differentiable quantization approach with multi-class STE and DropBits, enabling learning of optimal and heterogeneous quantization levels during training.
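As background for the STE mentioned above, here is a minimal sketch of the standard straight-through estimator on a single scalar weight. It is not the paper's multi-class STE, whose details are not given in this summary. The fixed grid step, the toy squared loss, and the names `quantize` and `ste_step` are illustrative assumptions.

```python
def quantize(w, step=0.25):
    """Round w to the nearest multiple of `step` (a fixed uniform grid; assumed for illustration)."""
    return round(w / step) * step


def ste_step(w, x, y, lr=0.1):
    """One gradient step on loss = (quantize(w) * x - y)**2 using a straight-through estimator."""
    q = quantize(w)
    # Exact gradient of the loss with respect to the quantized weight q.
    grad_q = 2.0 * (q * x - y) * x
    # STE assumption: rounding has zero gradient almost everywhere,
    # so the backward pass pretends dq/dw = 1 and passes grad_q through.
    grad_w = grad_q * 1.0
    return w - lr * grad_w


# Train the latent full-precision weight; the forward pass only ever sees quantize(w).
w = 0.9
for _ in range(50):
    w = ste_step(w, x=1.0, y=0.5)

print(w, quantize(w))
```

The latent weight keeps moving until its quantized value fits the target, which is how a full-precision weight can be trained even though the forward pass is discrete.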
Findings
CPQ outperforms fixed-level quantization on benchmarks.
Learning heterogeneous levels yields better accuracy.
DropBits effectively regularizes bit allocation across layers.
Abstract
Network quantization, which aims to reduce the bit-lengths of network weights and activations, has emerged as a way to deploy networks on resource-limited devices. Although recent studies have successfully discretized full-precision networks, they still incur large quantization errors after training, leaving a significant performance gap between a full-precision network and its quantized counterpart. In this work, we propose a novel quantization method for neural networks, Cluster-Promoting Quantization (CPQ), that finds the optimal quantization grids while naturally encouraging the underlying full-precision weights to gather cohesively around those grids during training. This property of CPQ comes from two main ingredients that enable differentiable quantization: i) the use of the categorical distribution designed by a specific probabilistic parametrization…
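To illustrate why weights that gather around the quantization grid incur less quantization error, the sketch below snaps two synthetic weight sets to the same grid and compares their mean squared error. The uniform 2-bit grid, the synthetic weights, and the helper names `make_grid`, `quantize`, and `mean_sq_error` are assumptions for illustration. It does not implement CPQ itself, which learns the grid during training.

```python
import random


def make_grid(num_bits, lo=-1.0, hi=1.0):
    """Uniform grid with 2**num_bits levels on [lo, hi] (assumed; CPQ learns its grid)."""
    levels = 2 ** num_bits
    step = (hi - lo) / (levels - 1)
    return [lo + i * step for i in range(levels)]


def quantize(w, grid):
    """Snap w to the nearest grid point."""
    return min(grid, key=lambda g: abs(g - w))


def mean_sq_error(weights, grid):
    """Mean squared difference between each weight and its quantized value."""
    return sum((w - quantize(w, grid)) ** 2 for w in weights) / len(weights)


rng = random.Random(0)
grid = make_grid(num_bits=2)  # 4 levels: -1, -1/3, 1/3, 1

# Weights spread uniformly over the range, with no relation to the grid.
spread = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
# Weights concentrated tightly around the grid points.
clustered = [rng.choice(grid) + rng.gauss(0.0, 0.02) for _ in range(1000)]

err_spread = mean_sq_error(spread, grid)
err_clustered = mean_sq_error(clustered, grid)
print(err_spread, err_clustered)
```

With the same grid and bit-width, the clustered weights lose far less information when quantized, which is the effect the abstract attributes to promoting clusters during training.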
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Brain Tumor Detection and Classification
Methods: Dropout
