Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study
Peiyu Liu, Zikang Liu, Ze-Feng Gao, Dawei Gao, Wayne Xin Zhao, Yaliang Li, Bolin Ding, Ji-Rong Wen

TL;DR
This study investigates how quantization affects the emergent abilities of large language models. It finds that 4-bit models retain these abilities, while 2-bit models suffer severe performance degradation that fine-tuning can partially recover.
Contribution
It provides the first detailed empirical analysis of how different levels of quantization impact emergent capabilities in LLMs, including insights for improving low-bit models.
Findings
4-bit quantized models retain emergent abilities
2-bit quantized models show severe performance degradation
Fine-tuning can partially recover performance in low-bit models
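As a rough illustration of why bit width matters, the sketch below applies generic round-to-nearest uniform quantization to synthetic Gaussian "weights" and compares the reconstruction error at 8, 4, and 2 bits. This is an assumption-laden toy, not the quantization method used in the paper: the function names, the min-max scaling scheme, and the random data are all hypothetical, and weight reconstruction error is only a loose proxy for the task-level abilities the study measures.

```python
import random


def quantize_dequantize(weights, bits):
    """Round-to-nearest uniform quantization over [min, max], mapped back to floats.

    Hypothetical helper for illustration; not the paper's quantization method.
    """
    levels = 2 ** bits - 1  # number of steps between the min and max value
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    return [lo + round((w - lo) / scale) * scale for w in weights]


def mean_abs_error(original, restored):
    """Average absolute difference between original and dequantized values."""
    return sum(abs(x - y) for x, y in zip(original, restored)) / len(original)


# Synthetic stand-in for one weight tensor (assumption: roughly Gaussian values).
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

errors = {
    bits: mean_abs_error(weights, quantize_dequantize(weights, bits))
    for bits in (8, 4, 2)
}

for bits, err in errors.items():
    print(f"{bits}-bit: mean abs error = {err:.4f}")
```

With only four representable values at 2 bits, most weights land far from their nearest level, so the error grows sharply compared with 4 bits. That is consistent in spirit with the findings listed above, although the sketch says nothing about in-context learning, chain-of-thought reasoning, or instruction following directly.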
Abstract
Despite their superior performance, large language models (LLMs) require significant computational resources for deployment and use. To overcome this issue, quantization methods have been widely applied to reduce the memory footprint of LLMs and to increase the inference rate. However, a major challenge is that low-bit quantization methods often lead to performance degradation, so it is important to understand how quantization impacts the capacity of LLMs. Unlike previous studies, which focus on overall performance, this work investigates the impact of quantization on emergent abilities, which are important characteristics that distinguish LLMs from small language models. Specifically, we examine the abilities of in-context learning, chain-of-thought reasoning, and instruction-following in quantized LLMs. Our empirical experiments show that these emergent abilities still…
Taxonomy
Topics: Topic Modeling · Ferroelectric and Negative Capacitance Devices
