Compression Scaling Laws: Unifying Sparsity and Quantization

Elias Frantar; Utku Evci; Wonpyo Park; Neil Houlsby; Dan Alistarh

arXiv:2502.16440 · cs.LG · February 27, 2025

Open Access

TL;DR

This paper shows that weight sparsity and quantization in large language models can be described by a common scaling law framework, in which each compression technique acts as a multiplier on the model's effective parameter count during pretraining.

Contribution

It extends previous scaling law work on weight sparsity to weight and activation quantization, so that different compression methods can be compared and combined within a single framework.

Findings

1. Weight sparsity acts as a constant multiplier on model size.

2. Weight-only quantization achieves strong parameter efficiency multipliers.

3. Full quantization of both weights and activations shows diminishing returns at lower bitwidths.

Abstract

We investigate how different compression techniques -- such as weight and activation quantization, and weight sparsity -- affect the scaling behavior of large language models (LLMs) during pretraining. Building on previous work showing that weight sparsity acts as a constant multiplier on model size in scaling laws, we demonstrate that this "effective parameter" scaling pattern extends to quantization as well. Specifically, we establish that weight-only quantization achieves strong parameter efficiency multipliers, while full quantization of both weights and activations shows diminishing returns at lower bitwidths. Our results suggest that different compression techniques can be unified under a common scaling law framework, enabling principled comparison and combination of these methods.
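The "effective parameter" idea in the abstract can be illustrated with a small sketch. The abstract does not state the functional form, so the code below assumes a Chinchilla-style power law in which the parameter count N is replaced by eff * N. The function names, the multiplier value 0.5, and every constant (a, b, e, alpha, beta) are hypothetical placeholders chosen for illustration, not fitted values from the paper.

```python
def effective_params(n_params: float, eff: float) -> float:
    """Effective parameter count: compression acts as a constant multiplier on N."""
    return eff * n_params


def loss(n_params: float, n_tokens: float, eff: float = 1.0,
         a: float = 400.0, b: float = 400.0, e: float = 1.7,
         alpha: float = 0.34, beta: float = 0.28) -> float:
    """Assumed Chinchilla-style loss with N replaced by eff * N.

    All constants are illustrative placeholders, not values from the paper.
    eff = 1.0 corresponds to an uncompressed (dense, full-precision) model.
    """
    n_eff = effective_params(n_params, eff)
    return a / n_eff ** alpha + b / n_tokens ** beta + e


if __name__ == "__main__":
    n, d = 1e9, 2e10  # hypothetical parameter and token counts
    # A compressed model with multiplier 0.5 is predicted to match
    # a dense model with half as many parameters.
    print(loss(n, d, eff=0.5), loss(0.5 * n, d))
```

Under this assumed form, comparing two compression methods reduces to comparing their multipliers: the method with the larger eff at the same storage or compute cost is predicted to reach lower loss for a given N and token budget.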

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling