Compression Represents Intelligence Linearly

Yuzhen Huang; Jinghan Zhang; Zifei Shan; Junxian He

arXiv:2404.09937 · cs.CL · August 20, 2024 · 2 cites

TL;DR

This study empirically demonstrates that large language models' intelligence, measured by benchmark scores, correlates almost linearly with their ability to compress text, supporting the idea that compression underpins intelligence.

Contribution

The paper provides the first large-scale empirical evidence linking LLMs' compression ability with their intelligence across multiple benchmarks.

Findings

01  LLMs' benchmark scores correlate almost linearly with their text compression ability.

02  Compression efficiency can serve as a reliable, unsupervised metric for evaluating model capabilities.

03  Open-source datasets and pipelines for assessing compression are provided.
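
To make the "almost linear" relationship in the first finding concrete, the following is a minimal sketch of how a Pearson correlation between a compression measure and average benchmark scores could be computed. The function name `pearson_r` and the four data points are hypothetical placeholders chosen for illustration; they are not results from the paper. The sketch assumes compression is reported as a cost (lower is better), so a strong relationship shows up as a correlation near -1.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalized covariance and variances; the 1/n factors cancel in r.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder values, NOT measurements from the paper:
# one compression cost and one average benchmark score per model.
compression_cost = [1.0, 0.9, 0.8, 0.7]
benchmark_score = [20.0, 30.0, 40.0, 50.0]

r = pearson_r(compression_cost, benchmark_score)
print(f"Pearson r = {r:.3f}")
```

Because the placeholder points lie exactly on a line with negative slope, the coefficient is -1 up to floating-point error; real measurements would be noisier.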

Abstract

There is a belief that learning to compress well will lead to intelligence. Recently, language modeling has been shown to be equivalent to compression, which offers a compelling rationale for the success of large language models (LLMs): the development of more advanced language models is essentially enhancing compression which facilitates intelligence. Despite such appealing discussions, little empirical evidence is present for the interplay between compression and intelligence. In this work, we examine their relationship in the context of LLMs, treating LLMs as data compressors. Given the abstract concept of "intelligence", we adopt the average downstream benchmark scores as a surrogate, specifically targeting intelligence related to knowledge and commonsense, coding, and mathematical reasoning. Across 12 benchmarks, our study brings together 31 public LLMs that originate from diverse…
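
As a rough illustration of what "treating LLMs as data compressors" can mean in practice, the sketch below converts a model's per-token log-probabilities into a code length in bits and normalizes by the number of characters. The bits-per-character normalization, the function name `bits_per_character`, and the toy inputs are assumptions made for this example; the truncated abstract above does not specify the exact metric. The underlying idea is standard: a token predicted with probability p costs -log2(p) bits under an ideal entropy coder such as arithmetic coding.

```python
import math


def bits_per_character(token_logprobs, text):
    """Ideal code length of `text` in bits, divided by its character count.

    token_logprobs: natural-log probabilities the model assigned to each
                    token of `text`, in order (hypothetical input format).
    text:           the original string those tokens decode to.
    """
    if not text:
        raise ValueError("text must be non-empty")
    # Sum of -ln(p) over tokens, converted from nats to bits.
    total_bits = -sum(token_logprobs) / math.log(2)
    return total_bits / len(text)


# Toy example: a 4-character string split into two tokens that the
# (hypothetical) model predicted with probabilities 0.5 and 0.25.
text = "abcd"
logprobs = [math.log(0.5), math.log(0.25)]  # 1 bit + 2 bits

print(bits_per_character(logprobs, text))
```

Normalizing by characters rather than tokens keeps the number comparable across models with different tokenizers, since the character count of the raw text does not depend on how it was tokenized.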

Code & Models

Repositories

hkust-nlp/llm-compression-intelligence (official)

Datasets

hkust-nlp/llm-compression
dataset · 63 dl

Taxonomy

Topics: Computability, Logic, AI Algorithms