Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox
Yijun Liu, Yuan Meng, Fang Wu, Shenhao Peng, Hang Yao, Chaoyu Guan, Chen Tang, Xinzhu Ma, Zhi Wang, Wenwu Zhu

TL;DR
This paper introduces a comprehensive benchmark, analysis, and toolbox to evaluate how quantization affects the generalization ability of large language models across various datasets and scenarios.
Contribution
It provides the first extensive benchmark and toolbox for assessing the impact of quantization on LLM generalization, including analysis of calibration data effects.
Findings
Calibration data distribution impacts quantized LLM performance.
Models with calibration data matching test distribution are not always optimal.
Extensive experiments reveal counter-intuitive insights about quantization effects.
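To make the role of calibration data concrete, the sketch below shows one simple way a calibration set can enter a quantizer: it fixes the scale of a symmetric round-to-nearest grid, and test values are then quantized with that scale. This is an illustrative toy, not the paper's benchmark or toolbox. The function names, the 4-bit setting, and the two Gaussian "distributions" are all assumptions chosen for the example, and it makes no claim about which calibration set performs better on real LLMs.

```python
import random


def calibrate_scale(calib_values, num_bits=4):
    """Pick a symmetric quantization scale from the largest calibration magnitude."""
    qmax = 2 ** (num_bits - 1) - 1  # e.g. 7 for signed 4-bit
    max_abs = max(abs(v) for v in calib_values)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize_dequantize(values, scale, num_bits=4):
    """Round to the integer grid, clip to the representable range, map back to floats."""
    qmax = 2 ** (num_bits - 1) - 1
    qmin = -qmax - 1
    out = []
    for v in values:
        q = max(qmin, min(qmax, round(v / scale)))
        out.append(q * scale)
    return out


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
# Hypothetical stand-ins for activations drawn from two different data sources.
narrow_calib = [rng.gauss(0.0, 1.0) for _ in range(2000)]
wide_calib = [rng.gauss(0.0, 4.0) for _ in range(2000)]
test_values = [rng.gauss(0.0, 4.0) for _ in range(2000)]

for name, calib in [("narrow", narrow_calib), ("wide", wide_calib)]:
    scale = calibrate_scale(calib)
    err = mean_squared_error(test_values, quantize_dequantize(test_values, scale))
    print(f"calibration={name:6s} scale={scale:.3f} test MSE={err:.4f}")
```

The only thing that changes between the two runs is the calibration set, so any difference in test error comes from the scale it produces. Real post-training quantization methods use calibration data in more elaborate ways, which is why the paper measures the effect empirically rather than assuming it.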
Abstract
Large language models (LLMs) have shown exciting progress in multiple scenarios, but their huge computational demands hinder deployment in many real-world applications. Quantization is an effective means of reducing memory footprint and inference cost, yet it suffers from performance degradation at low bit-widths. Understanding the impact of quantization on LLM capabilities, especially generalization ability, is crucial. However, the community's main focus remains on quantization algorithms and models, with insufficient attention given to whether quantized models retain the strong generalization abilities of LLMs. In this work, we fill this gap by providing a comprehensive benchmark suite for this research topic, including an evaluation system, detailed analyses, and a general toolbox. Specifically, based on the dominant pipeline in LLM quantization,…
Taxonomy
Topics: Natural Language Processing Techniques · Mathematics, Computing, and Information Processing
Methods: Sparse Evolutionary Training · Balanced Selection · Focus
