Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision
Xingchao Liu, Mao Ye, Dengyong Zhou, Qiang Liu

TL;DR
This paper introduces multipoint post-training quantization, which approximates neural network weights using multiple low-bit vectors to achieve mixed precision effects without hardware changes, improving accuracy on vision tasks.
Contribution
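The paper proposes multipoint quantization, which approximates each full-precision weight vector with a linear combination of several low-bit vectors. The number of vectors is chosen adaptively per weight vector, so weights that strongly affect the output get higher effective precision without hardware modifications.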
Findings
Outperforms state-of-the-art quantization methods on ImageNet classification.
Achieves higher accuracy in object detection tasks like PASCAL VOC.
Requires minimal additional memory and computation overhead.
Abstract
We consider the post-training quantization problem, which discretizes the weights of pre-trained deep neural networks without re-training the model. We propose multipoint quantization, a quantization method that approximates a full-precision weight vector using a linear combination of multiple vectors of low-bit numbers; this is in contrast to typical quantization methods that approximate each weight using a single low-precision number. Computationally, we construct the multipoint quantization with an efficient greedy selection procedure, and adaptively decide the number of low-precision points on each quantized weight vector based on the error of its output. This allows us to achieve higher precision levels for important weights that greatly influence the outputs, yielding an 'effect of mixed precision' but without physical mixed precision implementations (which require specialized…
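The greedy construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the symmetric uniform grid, the max-abs step size, the relative output-error stopping rule, the calibration inputs `X`, and every function name below are assumptions made for illustration. Each step quantizes the current residual onto a low-bit grid, and points are added only while the output error on the calibration inputs stays above a tolerance.

```python
import math

def output_norm(X, v):
    """Euclidean norm of X @ v, where X is a list of calibration input rows."""
    return math.sqrt(sum(sum(a * b for a, b in zip(row, v)) ** 2 for row in X))

def quantize_to_grid(r, bits):
    """Round r onto a symmetric integer grid in [-qmax, qmax] (assumes bits >= 2)."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(x) for x in r) / qmax  # assumed step-size choice
    q = [max(-qmax, min(qmax, round(x / scale))) for x in r]
    return scale, q

def multipoint_quantize(w, X, bits=4, tol=0.01, max_points=4):
    """Greedily approximate w by sum_i scales[i] * points[i] with low-bit points.

    Hypothetical stopping rule: stop once the output error on X falls to
    tol times the norm of the full-precision output, or at max_points.
    """
    residual = list(w)
    base = output_norm(X, w)
    scales, points = [], []
    while len(points) < max_points:
        if output_norm(X, residual) <= tol * base:
            break
        scale, q = quantize_to_grid(residual, bits)
        scales.append(scale)
        points.append(q)
        # The next point only has to explain what is still unexplained.
        residual = [x - scale * qi for x, qi in zip(residual, q)]
    return scales, points

def reconstruct(scales, points, n):
    """Rebuild the approximate weight vector of length n."""
    return [sum(s * p[i] for s, p in zip(scales, points)) for i in range(n)]

# Toy usage with made-up numbers.
w = [0.82, -0.31, 0.05, -0.67]
X = [[1.0, 0.0, 2.0, -1.0], [0.5, 1.5, -1.0, 0.0]]
scales, points = multipoint_quantize(w, X, bits=3, tol=0.01)
w_hat = reconstruct(scales, points, len(w))
```

Because the stopping rule looks at the output error rather than the weight error, weight vectors that barely change the output stop after one point, while influential ones receive more. In this sketch that is where the "mixed precision" effect comes from, even though every stored point uses the same bit width.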
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
