The Geometric Wall: Manifold Structure Predicts Layerwise Sparse Autoencoder Scaling Laws
Eslam Zaher, Maciej Trzaskowski, Quan Nguyen, Fred Roosta

TL;DR
The layerwise scaling behaviour of sparse autoencoders (SAEs) tracks the geometric structure of the activation manifold: curvature and intrinsic dimension predict how far reconstruction error can be reduced at each layer.
Contribution
The paper presents what the authors describe as the first cross-layer SAE scaling study, linking manifold geometry to layerwise scaling laws and reporting that this link transfers across models.
Findings
Manifold geometry predicts layerwise width exponents in SAEs.
Geometric summaries explain variations in reconstruction error across layers.
Higher curvature and intrinsic dimension correlate with higher residual floors.
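To make "width exponent" and "residual floor" concrete, here is a minimal sketch of fitting a per-layer width scaling curve. It assumes a saturating power law, error(w) = floor + a * w^(-alpha), which is a common choice but is not confirmed as the paper's functional form. The function name `fit_width_law` and the grid-search procedure are illustrative only.

```python
import numpy as np

def fit_width_law(widths, errors, n_grid=200):
    """Fit error(w) = floor + a * w**(-alpha) for one layer.

    Assumed functional form (not taken from the paper). The floor is
    found by grid search; for each candidate floor, a and alpha come
    from a straight-line fit of log(error - floor) against log(width).
    """
    widths = np.asarray(widths, dtype=float)
    errors = np.asarray(errors, dtype=float)
    log_w = np.log(widths)
    best = None
    # Candidate floors must stay strictly below the smallest observed error
    # so that log(error - floor) is defined.
    for floor in np.linspace(0.0, errors.min() * 0.999, n_grid):
        log_e = np.log(errors - floor)
        slope, intercept = np.polyfit(log_w, log_e, 1)
        resid = np.sum((log_e - (slope * log_w + intercept)) ** 2)
        if best is None or resid < best[0]:
            best = (resid, floor, np.exp(intercept), -slope)
    _, floor, a, alpha = best
    return floor, a, alpha

# Synthetic example: widths 2**10 .. 2**17 with a known floor and exponent.
widths = 2.0 ** np.arange(10, 18)
errors = 0.05 + 2.0 * widths ** -0.5
floor, a, alpha = fit_width_law(widths, errors)
```

Under this reading, a layer with a higher fitted `floor` is one where widening the dictionary stops helping sooner, and `alpha` is the layer's width exponent.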
Abstract
Sparse autoencoders (SAEs) operationalise the linear representation hypothesis: they reconstruct model activations as sparse linear combinations of interpretable dictionary atoms, on the implicit assumption that activation space is well approximated by a globally linear structure. Their reconstruction error varies sharply across layers in ways that existing scaling laws, fitted at single layers, do not explain. We argue that this variation is the empirical trace of a geometric mismatch: where the activation manifold is curved and its intrinsic dimension varies across layers, no sparse linear dictionary can match it uniformly, and the SAE's width-sparsity scaling becomes a layer-dependent function of manifold structure rather than a single universal law. We conduct the first cross-layer SAE scaling study, fitting and regressing on 844 residual-stream Gemma Scope SAE checkpoints across 68…
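The reconstruction described in the abstract, activations rebuilt as sparse linear combinations of dictionary atoms, can be sketched as follows. This is a generic top-k SAE forward pass written for illustration; the architecture, parameter names (`W_enc`, `W_dec`, `b_enc`, `b_dec`) and the sparsity rule are assumptions, not the checkpoints' actual design.

```python
import numpy as np

def sae_reconstruct(x, W_enc, b_enc, W_dec, b_dec, k):
    """Illustrative top-k sparse autoencoder forward pass.

    x:      (n, d) activations
    W_enc:  (d, width) encoder weights
    W_dec:  (width, d) decoder weights; each row is one dictionary atom
    k:      maximum number of active atoms per input
    """
    pre = (x - b_dec) @ W_enc + b_enc
    acts = np.maximum(pre, 0.0)  # non-negative codes
    width = acts.shape[1]
    if k < width:
        # Indices of everything except the k largest codes in each row.
        drop = np.argpartition(acts, -k, axis=1)[:, :-k]
        np.put_along_axis(acts, drop, 0.0, axis=1)
    recon = acts @ W_dec + b_dec  # sparse linear combination of atoms
    return recon, acts

def fraction_unexplained(x, recon):
    """Reconstruction error normalised by the variance of x."""
    num = np.sum((x - recon) ** 2)
    den = np.sum((x - x.mean(axis=0)) ** 2)
    return num / den
```

Because `recon` is always a linear combination of at most `k` rows of `W_dec`, the error this computes is exactly the quantity that a curved activation manifold would be expected to push up, which is the mismatch the abstract argues for.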
