AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs
Alireza Ghaffari, Sharareh Younesian, Vahid Partovi Nia, Boxing Chen, Masoud Asgharian

TL;DR
AdpQ is a zero-shot, calibration-free post-training quantization method for LLMs that maintains accuracy, reduces quantization time, and enhances privacy by eliminating calibration data.
Contribution
The paper introduces AdpQ, a novel adaptive PTQ approach inspired by Adaptive LASSO, that achieves state-of-the-art low-precision quantization without calibration data.
Findings
Achieves the same accuracy as existing methods on LLM benchmarks.
Reduces quantization time by at least 10x.
Eliminates the need for calibration data, enhancing privacy.
Abstract
The ever-growing computational complexity of Large Language Models (LLMs) necessitates efficient deployment strategies. The current state-of-the-art approaches for Post-training Quantization (PTQ) often require calibration to achieve the desired accuracy. This paper presents AdpQ, a novel zero-shot adaptive PTQ method for LLMs that achieves state-of-the-art performance in low-precision quantization (e.g., 3-bit) without requiring any calibration data. Inspired by the Adaptive LASSO regression model, our proposed approach tackles the challenge of outlier activations by separating salient weights using an adaptive soft-thresholding method. Guided by Adaptive LASSO, this method ensures that the distribution of the quantized weights closely follows that of the originally trained weights and eliminates the need for calibration data entirely, setting our method apart from popular approaches such as SpQR and…
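The idea of separating salient weights by adaptive soft-thresholding can be illustrated with a small sketch. This is not the authors' implementation: the function names, the penalty form lam * |theta| / |w|**gamma, the choice gamma = 1, and the min-max quantizer are all assumptions made for illustration. Under this penalty, a weight survives the threshold only when its magnitude exceeds lam ** (1 / (1 + gamma)), so large-magnitude weights are flagged as salient and quantized separately from the rest.

```python
import numpy as np

def adaptive_soft_threshold(w, lam, gamma=1.0, eps=1e-12):
    """Elementwise minimizer of 0.5*(w - theta)**2 + lam*|theta|/|w|**gamma.

    The per-weight threshold lam/|w|**gamma shrinks small weights to zero
    and leaves large weights almost untouched (Adaptive LASSO style).
    """
    mag = np.abs(w)
    thresh = lam / np.maximum(mag, eps) ** gamma
    return np.sign(w) * np.maximum(mag - thresh, 0.0)

def salient_mask(w, lam, gamma=1.0):
    """True where a weight survives thresholding (treated as salient here)."""
    return adaptive_soft_threshold(w, lam, gamma) != 0

def quantize_minmax(x, bits):
    """Uniform min-max quantize-dequantize (a hypothetical stand-in quantizer)."""
    if x.size == 0 or x.max() == x.min():
        return x.copy()
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / (2 ** bits - 1)
    return np.round((x - lo) / scale) * scale + lo

def quantize_with_split(w, lam, bits=3):
    """Quantize salient and non-salient weights as two separate groups."""
    mask = salient_mask(w, lam)
    out = np.empty_like(w, dtype=float)
    out[mask] = quantize_minmax(w[mask], bits)
    out[~mask] = quantize_minmax(w[~mask], bits)
    return out, mask

# Example: with lam = 0.25 and gamma = 1, weights with |w| > 0.5 are salient.
w = np.array([0.01, -0.02, 0.03, 2.0, -3.0])
w_q, mask = quantize_with_split(w, lam=0.25, bits=3)
```

Because each group gets its own quantization range, the few large weights no longer stretch the grid used for the many small ones. No calibration data enters the computation; only the weights themselves are used.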
Taxonomy
Topics: Medical Imaging Techniques and Applications · Advanced Radiotherapy Techniques · Advanced MRI Techniques and Applications
