The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?

Zhenheng Tang; Xiang Liu; Qian Wang; Peijie Dong; Bingsheng He; Xiaowen Chu; Bo Li

arXiv:2502.17535·cs.LG·February 26, 2025

Open Access

TL;DR

This paper proposes the lottery LLM hypothesis: for a given LLM and task, a smaller model can match the original's performance when assisted by multi-step reasoning and external tools. It challenges the current focus of compression research on preserving perplexity and simple task accuracy.

Contribution

It introduces the lottery LLM hypothesis and highlights capabilities that current compression methods overlook but that a compressed LLM needs in order to retain the original's performance.

Findings

01 Lottery LLMs can achieve performance comparable to the original LLM with fewer parameters.

02 Multi-step reasoning and external tools are what allow the smaller model to match the original.

03 Current compression methods, evaluated mainly by perplexity or simple task accuracy, overlook key capabilities needed for performance.

Abstract

Motivated by reducing the computational and storage costs of LLMs, model compression and KV cache compression have attracted much attention from researchers. However, current methods predominantly emphasize maintaining the performance of compressed LLMs, as measured by perplexity or simple accuracy on tasks of common sense knowledge QA and basic arithmetic reasoning. In this blog, we present a brief review of recent advancements in LLMs related to retrieval-augmented generation, multi-step reasoning, external tools, and computational expressivity, all of which substantially enhance LLM performance. Then, we propose a lottery LLM hypothesis suggesting that for a given LLM and task, there exists a smaller lottery LLM capable of producing the same performance as the original LLM with the assistance of multi-step reasoning and external tools. Based on the review of current progress in LLMs,…
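The hypothesis stated in the abstract can be written compactly as an existence claim. The sketch below is only an illustration: the symbols are assumptions introduced here and are not taken from the paper. $f_{\theta}$ is the original LLM, $f_{\theta'}$ the smaller lottery LLM, $q$ a task input with reference answer $y$, $P$ a performance measure, $\mathcal{A}$ a multi-step reasoning procedure, and $\mathcal{T}$ a set of external tools.

```latex
% Illustrative notation only; every symbol here is an assumption, not the paper's.
\[
\text{Given } f_{\theta} \text{ and a task } (q, y):\quad
\exists\, \theta' \ \text{with}\ |\theta'| < |\theta|
\ \text{such that}\quad
P\big(\mathcal{A}(f_{\theta'}, \mathcal{T})(q),\, y\big)
\;\ge\;
P\big(f_{\theta}(q),\, y\big).
\]
```

Read this way, the inequality encodes "the same performance" as a lower bound: the smaller model is not required to match the original on its own, only when it is run inside the procedure $\mathcal{A}$ with access to the tools in $\mathcal{T}$.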

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Softmax · Attention Is All You Need