CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs
Haoyu Wang, Bei Liu, Hang Shao, Bo Xiao, Ke Zeng, Guanglu Wan, Yanmin Qian

TL;DR
This paper introduces CLAQ, a novel column-level adaptive quantization framework for LLMs that significantly improves low-bit quantization performance by dynamically adjusting precision and reserving outliers, achieving state-of-the-art results.
Contribution
The paper proposes a new adaptive quantization method with centroid generation, bit-width adjustment, and outlier reservation, enhancing low-bit LLM quantization performance.
Findings
Achieves state-of-the-art results on LLaMA-1, LLaMA-2, and Yi models.
Excels particularly in extremely low-bit scenarios.
Demonstrates significant memory and efficiency improvements.
Abstract
Parameter quantization for Large Language Models (LLMs) has recently attracted increasing attention as a way to reduce memory costs and improve computational efficiency. Early approaches have been widely adopted. However, existing methods perform poorly in low-bit (such as 2 to 3 bits) scenarios. In this paper, we present a novel and effective Column-Level Adaptive weight Quantization (CLAQ) framework that introduces three different types of adaptive strategies for LLM quantization. First, a K-Means clustering based algorithm is proposed that allows dynamic generation of quantization centroids for each column of a parameter matrix. Second, we design an outlier-guided adaptive precision search strategy that dynamically assigns varying bit-widths to different columns. Finally, a dynamic outlier reservation scheme is developed to retain some parameters in their…
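To make the first strategy concrete, here is a minimal sketch of per-column centroid generation with 1-D K-Means. It is an illustration under stated assumptions, not the authors' implementation: the function names (`kmeans_1d`, `quantize_column`, `dequantize_column`), the evenly spaced centroid initialization, and the fixed iteration count are all hypothetical choices, and the adaptive bit-width search and outlier reservation steps are not modeled.

```python
from typing import List, Tuple


def kmeans_1d(values: List[float], k: int, iters: int = 25) -> List[float]:
    """Lloyd's algorithm on scalars; returns k centroids sorted ascending."""
    lo, hi = min(values), max(values)
    if k == 1 or lo == hi:
        # Degenerate case: every centroid is the column mean.
        return [sum(values) / len(values)] * k
    # Assumption: deterministic init, centroids spread evenly over the range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        buckets: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            buckets[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(b) / len(b) if b else c
                     for b, c in zip(buckets, centroids)]
    return sorted(centroids)


def quantize_column(column: List[float], bits: int) -> Tuple[List[int], List[float]]:
    """Map each weight in one column to the index of its nearest centroid."""
    centroids = kmeans_1d(column, 2 ** bits)
    codes = [min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
             for v in column]
    return codes, centroids


def dequantize_column(codes: List[int], centroids: List[float]) -> List[float]:
    """Reconstruct approximate weights by looking up each code's centroid."""
    return [centroids[c] for c in codes]


if __name__ == "__main__":
    column = [-1.0, -0.9, 0.0, 0.1, 1.0, 1.1, 5.0, 5.1]
    codes, centroids = quantize_column(column, bits=2)
    print(codes, centroids)
    print(dequantize_column(codes, centroids))
```

Each column stores only its small codebook (here 2^bits centroids) plus one low-bit code per weight, which is why a codebook fitted to the column's own value distribution can track non-uniform weights more closely than a fixed uniform grid.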
Taxonomy
Topics: Advancements in Photolithography Techniques · Advancements in Semiconductor Devices and Circuit Design · Advanced Data Storage Technologies
Methods: k-Means Clustering
