TL;DR
DiscQuant introduces a data-dependent rounding method based on discrepancy theory that significantly improves neural network quantization accuracy over existing methods, especially for low-bit representations.
Contribution
The paper presents a novel discrepancy theory-based rounding algorithm, DiscQuant, that enhances neural network quantization by minimizing approximation error with theoretical guarantees.
Findings
DiscQuant outperforms GPTQ and RTN in accuracy on benchmark datasets.
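Theoretical analysis shows that, given a limited number of data samples, nearly all weights can be rounded with small expected approximation error, provided the model's gradients span a low-rank space.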
Empirical results demonstrate substantial accuracy improvements in quantized models.
Abstract
Quantizing the weights of a neural network has two steps: (1) finding a good low bit-complexity representation for weights (which we call the quantization grid) and (2) rounding the original weights to values in the quantization grid. In this paper, we study the problem of rounding optimally given any quantization grid. The simplest and most commonly used way to round is Round-to-Nearest (RTN). By rounding in a data-dependent way instead, one can improve the quality of the quantized model significantly. We study the rounding problem through the lens of discrepancy theory, which studies how well we can round a continuous solution to a discrete solution without affecting solution quality too much. We prove that given m = poly(1/ε) samples from the data distribution, we can round all but O(m) model weights such that the expected approximation error of the quantized…
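To make the contrast between RTN and data-dependent rounding concrete, here is a minimal Python sketch. It is not the DiscQuant algorithm. It assumes a toy linear model and uses a simple greedy coordinate-flip heuristic as a stand-in for a data-dependent rounder. The function names, the uniform 8-level grid, and the random data are all hypothetical choices made for illustration.

```python
import numpy as np

def round_to_nearest(w, grid):
    """RTN: map each weight to the closest value in the quantization grid."""
    idx = np.abs(w[:, None] - grid[None, :]).argmin(axis=1)
    return grid[idx]

def neighbors(w, grid):
    """Grid values just below and just above each weight (grid must be sorted)."""
    hi_idx = np.clip(np.searchsorted(grid, w), 1, len(grid) - 1)
    return grid[hi_idx - 1], grid[hi_idx]

def greedy_data_dependent_round(w, grid, X, sweeps=3):
    """Illustrative heuristic (NOT DiscQuant): start from RTN, then flip a
    weight to its other neighboring grid value whenever that lowers the
    squared output error ||X (q - w)||^2 on the sample matrix X."""
    lo, hi = neighbors(w, grid)
    q = round_to_nearest(w, grid)
    resid = X @ (q - w)  # output error of the current rounding
    for _ in range(sweeps):
        for i in range(len(w)):
            alt = hi[i] if q[i] == lo[i] else lo[i]
            new_resid = resid + X[:, i] * (alt - q[i])
            if new_resid @ new_resid < resid @ resid:
                q[i], resid = alt, new_resid
    return q

# Toy setup: 64 weights, 16 data samples, uniform grid with 8 levels (3 bits).
rng = np.random.default_rng(0)
w = rng.uniform(-1.0, 1.0, size=64)
X = rng.normal(size=(16, 64))
grid = np.linspace(-1.0, 1.0, 8)

q_rtn = round_to_nearest(w, grid)
q_dd = greedy_data_dependent_round(w, grid, X)

err_rtn = np.linalg.norm(X @ (q_rtn - w))
err_dd = np.linalg.norm(X @ (q_dd - w))
print(f"RTN output error:            {err_rtn:.4f}")
print(f"Data-dependent output error: {err_dd:.4f}")
```

Because the heuristic starts from the RTN solution and only accepts flips that reduce the error, its error on the samples it was given is never worse than RTN's. The sketch measures error on those same samples only. The guarantee described in the abstract concerns expected error on the true data distribution, which this toy example does not test.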
