Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox

Yijun Liu; Yuan Meng; Fang Wu; Shenhao Peng; Hang Yao; Chaoyu Guan; Chen Tang; Xinzhu Ma; Zhi Wang; Wenwu Zhu

arXiv:2406.12928·cs.LG·June 21, 2024

TL;DR

This paper introduces a comprehensive benchmark, analysis, and toolbox to evaluate how quantization affects the generalization ability of large language models across various datasets and scenarios.

Contribution

It provides the first extensive benchmark and toolbox for assessing the impact of quantization on LLM generalization, including analysis of calibration data effects.

Findings

01. Calibration data distribution impacts quantized LLM performance.

02. Models with calibration data matching test distribution are not always optimal.

03. Extensive experiments reveal counter-intuitive insights about quantization effects.
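
As a rough illustration of why calibration data can matter at all (the first finding), the following toy sketch sets the range of a 4-bit symmetric quantizer from a calibration set and then measures the error on separate test data. This is not the paper's method or the API of its toolbox: the function names, the max-abs scale rule, and the Gaussian data are all assumptions made for illustration. The toy only shows that the calibration distribution changes the quantization error. It does not reproduce the second finding, which concerns full LLMs and real datasets.

```python
import random

def calibrate_scale(calib, bits=4):
    """Hypothetical rule: map the largest calibration magnitude to the top integer level."""
    qmax = 2 ** (bits - 1) - 1
    return max(abs(x) for x in calib) / qmax

def fake_quantize(x, scale, bits=4):
    """Round to the nearest level, clip to the signed range, and map back to a float."""
    qmax = 2 ** (bits - 1) - 1
    q = round(x / scale)
    q = max(-qmax, min(qmax, q))
    return q * scale

def mse(data, scale, bits=4):
    """Mean squared error between the data and its quantized version."""
    return sum((x - fake_quantize(x, scale, bits)) ** 2 for x in data) / len(data)

rng = random.Random(0)
# Synthetic stand-ins for "test" values and three calibration sets (assumed data).
test_data = [rng.gauss(0.0, 1.0) for _ in range(2000)]
calibration_sets = {
    "matched": [rng.gauss(0.0, 1.0) for _ in range(256)],  # same spread as test data
    "narrow": [rng.gauss(0.0, 0.2) for _ in range(256)],   # range too small: clipping
    "wide": [rng.gauss(0.0, 5.0) for _ in range(256)],     # range too large: coarse steps
}

results = {}
for name, calib in calibration_sets.items():
    scale = calibrate_scale(calib)
    results[name] = mse(test_data, scale)
    print(f"{name}: scale={scale:.4f} test MSE={results[name]:.5f}")
```

In this sketch a calibration set that is too narrow clips most test values, while one that is too wide wastes the few available levels on a range the test data never reaches. Real post-training quantization methods use calibration data in more involved ways, so the toy should be read only as intuition.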

Abstract

Large language models (LLMs) have exhibited exciting progress in multiple scenarios, while the huge computational demands hinder their deployments in lots of real-world applications. As an effective means to reduce memory footprint and inference cost, quantization also faces challenges in performance degradation at low bit-widths. Understanding the impact of quantization on LLM capabilities, especially the generalization ability, is crucial. However, the community's main focus remains on the algorithms and models of quantization, with insufficient attention given to whether the quantized models can retain the strong generalization abilities of LLMs. In this work, we fill this gap by providing a comprehensive benchmark suite for this research topic, including an evaluation system, detailed analyses, and a general toolbox. Specifically, based on the dominant pipeline in LLM quantization,…

Code & Models

Repositories

tsingmaoai/mi-optimize (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Mathematics, Computing, and Information Processing

Methods: Sparse Evolutionary Training · Balanced Selection · Focus