Geometric Limits of Knowledge Distillation: A Minimum-Width Theorem via Superposition Theory

Nilesh Sarkar; Dawar Jyoti Deka

arXiv:2604.04037·cs.LG·April 8, 2026


TL;DR

The paper argues that the loss floor in knowledge distillation is geometric: it is set by how many features a student of a given width can hold in superposition, and it can be predicted from sparse-autoencoder measurements of feature count and sparsity.

Contribution

The paper introduces a geometric minimum-width theorem for knowledge distillation based on superposition theory, linking the number of features a student can represent to its width, and validates the theorem empirically on a toy model and on Pythia-410M.

Findings

01

Performance saturates at a geometric loss floor related to feature superposition.

02

The loss floor can be predicted from autoencoder-measured feature capacity.

03

Coarse concepts survive even with significant feature loss, indicating the floor arises from fine-grained feature loss.
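The findings above can be illustrated with a toy calculation. The summary does not give the paper's exact floor formula, so the sketch below assumes one simple reading: the student keeps the most important features up to its budget, and the floor is the summed importance of everything it drops. The function name `importance_weighted_floor` and the power-law importances are hypothetical, chosen only for illustration.

```python
def importance_weighted_floor(importances, budget):
    """Summed importance of the features that do not fit in the budget.

    Assumption (not taken from the paper): the student retains the
    highest-importance features first and loses the rest entirely.
    """
    ranked = sorted(importances, reverse=True)
    kept = max(0, min(len(ranked), int(budget)))
    return sum(ranked[kept:])


if __name__ == "__main__":
    # Hypothetical importances with power-law decay: a few coarse,
    # high-importance features and a long tail of fine-grained ones.
    importances = [1.0 / (i + 1) for i in range(1000)]
    for budget in (50, 200, 800):
        floor = importance_weighted_floor(importances, budget)
        print(f"budget={budget:4d}  floor={floor:.3f}")
```

Under this assumption the floor shrinks monotonically as the budget grows, and the features lost first are the low-importance tail, which is consistent with coarse concepts surviving while fine-grained features are dropped.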

Abstract

Knowledge distillation compresses large teachers into smaller students, but performance saturates at a loss floor that persists across training methods and objectives. We argue this floor is geometric: neural networks represent far more features than dimensions through superposition, and a student of width $d_S$ can encode at most $d_S \cdot g(\alpha)$ features, where $g(\alpha) = 1 / \left((1 - \alpha) \ln \frac{1}{1 - \alpha}\right)$ is a sparsity-dependent capacity function. Features beyond this budget are permanently lost, yielding an importance-weighted loss floor. We validate on a toy model (48 configurations, median accuracy >93%) and on Pythia-410M, where sparse autoencoders measure $F \approx 28{,}700$ features at $\alpha \approx 0.992$ (critical width $d_S^{*} \approx 1{,}065$). Distillation into five student widths confirms the predicted monotonic floor ordering. The observed floor decomposes…
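The capacity function in the abstract is easy to evaluate numerically. The sketch below implements $g(\alpha)$ as written and, as an assumption not stated explicitly in the abstract, takes the critical width to be the width at which the budget $d_S \cdot g(\alpha)$ equals the feature count $F$. The function names are illustrative.

```python
import math


def capacity_per_dim(alpha: float) -> float:
    """g(alpha) = 1 / ((1 - alpha) * ln(1 / (1 - alpha))), for 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    p = 1.0 - alpha  # fraction of the time a feature is active
    return 1.0 / (p * math.log(1.0 / p))


def feature_budget(width: int, alpha: float) -> float:
    """Upper bound on features a student of this width can encode."""
    return width * capacity_per_dim(alpha)


def critical_width(num_features: float, alpha: float) -> float:
    """Width at which the budget equals num_features.

    Assumption: the critical width solves d * g(alpha) = F.
    """
    return num_features / capacity_per_dim(alpha)


if __name__ == "__main__":
    alpha, num_features = 0.992, 28_700  # rounded values from the abstract
    print(f"g(alpha)       = {capacity_per_dim(alpha):.3f}")
    print(f"critical width = {critical_width(num_features, alpha):.1f}")
```

With the rounded inputs quoted in the abstract, this gives a critical width of roughly 1,100, close to the reported $d_S^{*} \approx 1{,}065$. The gap may come from rounding in $\alpha$ and $F$, since $g(\alpha)$ is sensitive to small changes in $\alpha$ near 1.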
