Rethinking Post-Training Quantization: Introducing a Statistical Pre-Calibration Approach
Alireza Ghaffari, Sharareh Younesian, Boxing Chen, Vahid Partovi Nia, Masoud Asgharian

TL;DR
This paper introduces a statistical pre-calibration approach for post-training quantization of large language models, aiming to improve robustness and efficiency without relying solely on traditional calibration techniques.
Contribution
It proposes a weight-adaptive PTQ method that minimizes Kullback-Leibler divergence, serving as a pre-calibration step to enhance model quantization accuracy.
Findings
Achieves comparable accuracy to calibration-based PTQ methods
Preserves weight distribution and Shannon information content
Provides a robust, efficient deployment strategy for LLMs
Abstract
As Large Language Models (LLMs) become increasingly computationally complex, developing efficient deployment strategies, such as quantization, becomes crucial. State-of-the-art Post-training Quantization (PTQ) techniques often rely on calibration processes to maintain the accuracy of these models. However, while these calibration techniques can enhance performance in certain domains, they may not be as effective in others. This paper aims to draw attention to robust statistical approaches that can mitigate such issues. We propose a weight-adaptive PTQ method that can be considered a precursor to calibration-based PTQ methods, guiding the quantization process to preserve the distribution of weights by minimizing the Kullback-Leibler divergence between the quantized weights and the originally trained weights. This minimization ensures that the quantized model retains the Shannon…
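To make the idea of preserving the weight distribution concrete, here is a minimal sketch in Python with NumPy. It is not the authors' algorithm. It assumes a symmetric round-to-nearest uniform quantizer, estimates both distributions with histograms over shared bins, and runs a plain grid search over candidate scales to find the one with the smallest discrete KL divergence, KL(p || q) = sum_i p_i log(p_i / q_i). All function names, the candidate range, and the bin count are hypothetical choices made for illustration.

```python
import numpy as np


def quantize_uniform(w, scale, n_bits=4):
    """Symmetric round-to-nearest quantization, then dequantization (assumed quantizer)."""
    qmax = 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale


def kl_divergence(p, q, eps=1e-10):
    """Discrete KL(p || q) in nats; eps smoothing keeps empty bins finite."""
    p = (p + eps) / np.sum(p + eps)
    q = (q + eps) / np.sum(q + eps)
    return float(np.sum(p * np.log(p / q)))


def weight_kl(w, w_hat, bin_edges):
    """KL divergence between histograms of original and quantized weights."""
    p, _ = np.histogram(w, bins=bin_edges)
    q, _ = np.histogram(w_hat, bins=bin_edges)
    return kl_divergence(p.astype(float), q.astype(float))


def search_scale(w, n_bits=4, n_candidates=50, n_bins=16):
    """Grid search for the scale whose quantized weights are closest in KL (illustrative only)."""
    w_absmax = float(np.max(np.abs(w)))
    qmax = 2 ** (n_bits - 1) - 1
    bin_edges = np.linspace(-w_absmax, w_absmax, n_bins + 1)
    best_scale, best_kl = None, np.inf
    # Candidate clipping ranges from 30% to 100% of the largest absolute weight.
    for frac in np.linspace(0.3, 1.0, n_candidates):
        scale = frac * w_absmax / qmax
        kl = weight_kl(w, quantize_uniform(w, scale, n_bits), bin_edges)
        if kl < best_kl:
            best_scale, best_kl = scale, kl
    return best_scale, best_kl


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.normal(0.0, 0.02, size=4096)  # synthetic stand-in for a weight tensor
    scale, kl = search_scale(weights)
    print(f"selected scale={scale:.6f}, KL={kl:.6f}")
```

Because the search only looks at the weights themselves, it needs no calibration data, which matches the pre-calibration framing of the abstract. The histogram estimate is coarse and sensitive to the bin count, so the selected scale should be read as illustrative rather than as a reproduction of the paper's results.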
Taxonomy
Topics: Statistical Methods and Inference
Methods: Softmax · Attention Is All You Need
