On the Complexity of Neural Computation in Superposition
Micah Adler, Nir Shavit

TL;DR
This paper establishes theoretical complexity bounds for neural networks computing in superposition, revealing fundamental limits and capabilities that influence model sparsification, capacity, and feature representation.
Contribution
It provides the first lower bounds for superposition computation in neural networks, showing the relationship between neurons, parameters, and features, and offers nearly tight upper bounds for logical operations.
Findings
Computing \(m'\) features in superposition requires at least \(\Omega(\sqrt{m' \log m'})\) neurons.
Networks can compute logical operations such as pairwise AND with \(O(\sqrt{m'} \log m')\) neurons and \(O(m' \log^2 m')\) parameters.
Capacity is bounded: a network with \(n\) neurons can compute at most \(O(n^2 / \log n)\) features, a subexponential limit.
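To get a feel for how these bounds scale, the following minimal sketch evaluates their growth rates for a few sizes. It assumes the neuron lower bound grows as sqrt(m' log m') and the capacity bound as n^2 / log n; the hidden constants are dropped and the logarithm base (2 here) is an arbitrary choice, so the numbers indicate orders of magnitude only. The function names are hypothetical and not taken from the paper.

```python
import math


def neuron_lower_bound(num_features: int) -> float:
    """Growth rate sqrt(m' log m') of the neuron lower bound (constant dropped)."""
    return math.sqrt(num_features * math.log2(num_features))


def capacity_upper_bound(num_neurons: int) -> float:
    """Growth rate n^2 / log n of the feature-capacity bound (constant dropped)."""
    return num_neurons ** 2 / math.log2(num_neurons)


if __name__ == "__main__":
    for m in (2 ** 10, 2 ** 16, 2 ** 20):
        print(f"m' = {m:>8}: neurons grow like {neuron_lower_bound(m):,.0f}")
    for n in (2 ** 10, 2 ** 12):
        print(f"n  = {n:>8}: features grow like {capacity_upper_bound(n):,.0f}")
```

The two functions are roughly inverse to each other up to constant and logarithmic factors, which is why the capacity limit follows from the neuron lower bound.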
Abstract
Superposition, the ability of neural networks to represent more features than neurons, is increasingly seen as key to the efficiency of large models. This paper investigates the theoretical foundations of computing in superposition, establishing complexity bounds for explicit, provably correct algorithms. We present the first lower bounds for a neural network computing in superposition, showing that for a broad class of problems, including permutations and pairwise logical operations, computing \(m'\) features in superposition requires at least \(\Omega(\sqrt{m' \log m'})\) neurons and \(\Omega(m' \log m')\) parameters. This implies an explicit limit on how much one can sparsify or distill a model while preserving its expressibility, and complements empirical scaling laws by implying the first subexponential bound on capacity: a network with \(n\) neurons can compute at most …
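As a toy illustration of what "more features than neurons" means, the sketch below stores a sparse set of active features in a vector with fewer coordinates than there are features, then reads them back. This is not the paper's construction: the random ±1 codebook, the sizes, and the 0.5 threshold are all assumptions chosen for the example, and it shows only representation in superposition, not computation.

```python
import random

NUM_NEURONS = 512     # dimension of the shared activation vector (assumed)
NUM_FEATURES = 2048   # four times as many features as neurons (assumed)


def make_codebook(num_features, dim, rng):
    """One random +/-1 direction per feature; random directions are nearly orthogonal."""
    return [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(num_features)]


def encode(active, codebook, dim):
    """Superpose the directions of all active features into a single vector."""
    state = [0] * dim
    for feature in active:
        for i, sign in enumerate(codebook[feature]):
            state[i] += sign
    return state


def decode(state, codebook, dim, threshold=0.5):
    """Report a feature as active when its normalized overlap with the state is large."""
    recovered = set()
    for feature, direction in enumerate(codebook):
        overlap = sum(s * d for s, d in zip(state, direction)) / dim
        if overlap > threshold:
            recovered.add(feature)
    return recovered


rng = random.Random(0)
codebook = make_codebook(NUM_FEATURES, NUM_NEURONS, rng)
active = {3, 700, 1999}                      # a sparse set of active features
state = encode(active, codebook, NUM_NEURONS)
recovered = decode(state, codebook, NUM_NEURONS)
print(recovered == active)
```

Recovery works here because only a few features are active at once, so the interference between nearly orthogonal directions stays well below the threshold; with many simultaneously active features the overlaps would blur together.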
Taxonomy
Topics: Computability, Logic, AI Algorithms · Neural Networks and Applications · Semigroups and Automata Theory
