Post-training Quantization for Neural Networks with Provable Guarantees
Jinjie Zhang, Yixuan Zhou, Rayan Saab

TL;DR
This paper introduces a generalized post-training neural network quantization method, GPFQ, with provable error guarantees, demonstrating its effectiveness on various architectures and datasets with minimal accuracy loss.
Contribution
The paper extends GPFQ to general quantization alphabets, proposes modifications that promote sparsity of the weights, rigorously analyzes the associated error, and applies the method to different architectures.
Findings
Quantized models show minor accuracy loss on ImageNet.
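For quantizing a single-layer network, the relative square error decays essentially linearly in the number of weights, so over-parameterized networks quantize with smaller relative error.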
Modifications like bias correction improve quantization accuracy.
Abstract
While neural networks have been remarkably successful in a wide array of applications, implementing them in resource-constrained hardware remains an area of intense research. By replacing the weights of a neural network with quantized (e.g., 4-bit, or binary) counterparts, massive savings in computation cost, memory, and power consumption are attained. To that end, we generalize a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism. Among other things, we propose modifications to promote sparsity of the weights, and rigorously analyze the associated error. Additionally, our error analysis expands the results of previous work on GPFQ to handle general quantization alphabets, showing that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights -- i.e., level of…
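The abstract names the greedy path-following mechanism without stating its update rule. The following is a minimal Python sketch of one common way such a scheme is described for a single neuron: weights are processed one at a time, and each quantized value is chosen so that the running error between the analog and quantized pre-activations stays small. The function names, the argument layout, and the choice to feed the same inputs to the analog and quantized neuron are assumptions made for illustration, not the authors' implementation; the sparsity-promoting modifications and bias correction mentioned above are not included.

```python
def nearest_level(value, alphabet):
    """Round a scalar to the closest element of the quantization alphabet."""
    return min(alphabet, key=lambda level: abs(level - value))


def greedy_quantize_neuron(weights, features, alphabet):
    """Greedily quantize one neuron's weights (hypothetical helper).

    weights:  list of N floats, the neuron's analog weights.
    features: list of N columns; features[t] holds the m input values
              that multiply weights[t] across m data samples.
    alphabet: allowed quantized values, e.g. [-1.0, 0.0, 1.0].
    Returns the quantized weights and the final residual X w - X q.
    """
    m = len(features[0])
    residual = [0.0] * m  # running error over the coordinates seen so far
    quantized = []
    for w_t, x_t in zip(weights, features):
        norm_sq = sum(v * v for v in x_t)
        if norm_sq == 0.0:
            # A zero column cannot affect the output; round the weight directly.
            q_t = nearest_level(w_t, alphabet)
        else:
            # Project (residual + w_t * x_t) onto x_t, then round the coefficient.
            target = [r + w_t * v for r, v in zip(residual, x_t)]
            coeff = sum(a * v for a, v in zip(target, x_t)) / norm_sq
            q_t = nearest_level(coeff, alphabet)
        # Carry the error introduced at this step forward to the next one.
        residual = [r + (w_t - q_t) * v for r, v in zip(residual, x_t)]
        quantized.append(q_t)
    return quantized, residual
```

Calling `greedy_quantize_neuron(weights, features, [-1.0, 0.0, 1.0])` yields a ternary weight vector; comparing the squared norm of the returned residual with that of the analog output `X w` gives the kind of relative square error the abstract refers to.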
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
