How Many Features Can a Language Model Store Under the Linear Representation Hypothesis?

Nikhil Garg; Jon Kleinberg; Kenny Peng

arXiv:2602.11246 · cs.LG · February 13, 2026

Open Access

TL;DR

This paper develops a mathematical framework to analyze how many features a language model can store and access linearly, providing bounds that support the superposition hypothesis in neural representations.

Contribution

It establishes nearly matching upper and lower bounds for linear compressed sensing, and shows that linear accessibility is a stronger requirement than linear representation alone.

Findings

01. Neurons can store exponentially many features under the LRH.

02. Linear accessibility is a stronger condition than linear representation.

03. Theoretical bounds differ significantly from classical compressed sensing results.

Abstract

We introduce a mathematical framework for the linear representation hypothesis (LRH), which asserts that intermediate layers of language models store features linearly. We separate the hypothesis into two claims: linear representation (features are linearly embedded in neuron activations) and linear accessibility (features can be linearly decoded). We then ask: How many neurons $d$ suffice to both linearly represent and linearly access $m$ features? Classical results in compressed sensing imply that for $k$-sparse inputs, $d = O(k \log(m/k))$ suffices if we allow non-linear decoding algorithms (Candes and Tao, 2006; Candes et al., 2006; Donoho, 2006). However, the additional requirement of linear decoding takes the problem out of classical compressed sensing and into linear compressed sensing. Our main theoretical result establishes nearly-matching upper and lower bounds for linear…
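The two claims in the abstract can be illustrated with a small numerical sketch. This is not the paper's construction or its formal definition of linear accessibility. It is a toy setup under stated assumptions: features are binary and $k$-sparse, each feature gets a random unit direction, the embedding is the sum of the active directions, and each feature is read back with a single dot product. The final threshold is used only to score the readout. All function names, sizes, and the threshold value are hypothetical choices made for illustration.

```python
import numpy as np


def random_feature_directions(m, d, rng):
    """Return a d x m matrix whose columns are random unit vectors, one per feature."""
    W = rng.standard_normal((d, m))
    return W / np.linalg.norm(W, axis=0, keepdims=True)


def sparse_binary_features(m, k, rng):
    """Return a length-m 0/1 vector with exactly k active features."""
    x = np.zeros(m)
    x[rng.choice(m, size=k, replace=False)] = 1.0
    return x


def linear_roundtrip(m=2000, d=1000, k=4, threshold=0.5, seed=0):
    """Embed a k-sparse feature vector in d < m neurons and read it back linearly.

    Assumed toy setup, not the paper's: random directions, binary features,
    and a fixed threshold applied to the linear scores.
    """
    rng = np.random.default_rng(seed)
    W = random_feature_directions(m, d, rng)
    x = sparse_binary_features(m, k, rng)

    h = W @ x          # linear representation: activations are a linear map of features
    scores = W.T @ h   # linear readout: one dot product (probe) per feature
    x_hat = (scores > threshold).astype(float)

    # Largest overlap between two distinct feature directions (interference level).
    gram = W.T @ W
    coherence = np.max(np.abs(gram - np.eye(m)))
    return x, x_hat, coherence


if __name__ == "__main__":
    x, x_hat, coherence = linear_roundtrip()
    print("features recovered exactly:", np.array_equal(x, x_hat))
    print("max overlap between feature directions:", round(float(coherence), 3))
```

Here $m > d$, so the features cannot all be orthogonal, and the readout works only because the input is sparse and the directions overlap weakly. Raising `k` or lowering `d` in this sketch increases the interference in `scores`, which is the trade-off the bounds in the abstract quantify.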

Taxonomy

Topics: Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques · Ferroelectric and Negative Capacitance Devices