Spectral Edge Dynamics Reveal Functional Modes of Learning
Yongzhong Xu

TL;DR
This paper shows that training dynamics during grokking concentrate along a small set of dominant update directions, the spectral edge. These directions correspond to functional modes of learning that standard interpretability tools do not capture.
Contribution
It introduces spectral edge analysis to uncover low-dimensional functional modes of learning, revealing how task symmetry influences these modes.
Findings
Spectral edges reliably distinguish grokking from non-grokking regimes.
Functional modes vary with task symmetry, with simple harmonic structures emerging in symmetric tasks.
Multitask training enhances the concentration of spectral modes, reflecting compositional structure.
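To make the idea of a small set of dominant update directions concrete, the following minimal sketch stacks parameter updates from a window of training steps into a matrix and inspects its singular values. This illustrates the general idea only and is not the paper's procedure: the function names, the synthetic low-rank-plus-noise data, and the two summary statistics (energy fraction and singular-value gap) are all assumptions chosen for the example.

```python
import numpy as np

def update_spectrum(updates):
    """Singular values and right singular vectors of a (steps x params) matrix.

    Each row is assumed to be one flattened parameter update.
    """
    _, s, vt = np.linalg.svd(updates, full_matrices=False)
    return s, vt

def top_k_energy(s, k):
    """Fraction of squared singular-value mass in the k leading directions."""
    return float(np.sum(s[:k] ** 2) / np.sum(s ** 2))

# Synthetic stand-in for real training updates (hypothetical data):
# a rank-3 signal plus small isotropic noise.
rng = np.random.default_rng(0)
steps, params, rank = 50, 200, 3
coeffs = rng.normal(size=(steps, rank))
directions = rng.normal(size=(rank, params))
updates = coeffs @ directions + 0.01 * rng.normal(size=(steps, params))

s, vt = update_spectrum(updates)
energy = top_k_energy(s, rank)      # close to 1 when updates are low-rank
gap = s[rank - 1] / s[rank]         # ratio across the assumed edge

print(f"top-{rank} energy fraction: {energy:.4f}")
print(f"singular-value gap at k={rank}: {gap:.1f}")
```

With real training runs, `updates` would be replaced by recorded parameter differences, and the choice of window length and of `k` would need to follow the paper's definitions.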
Abstract
Training dynamics during grokking concentrate along a small number of dominant update directions -- the spectral edge -- which reliably distinguishes grokking from non-grokking regimes. We show that standard mechanistic interpretability tools (head attribution, activation probing, sparse autoencoders) fail to capture these directions: their structure is not localized in parameter or feature space. Instead, each direction induces a structured function over the input domain, revealing low-dimensional functional modes invisible to representation-level analysis. For modular addition, all leading directions collapse to a single Fourier mode. For multiplication, the same collapse appears only in the discrete-log basis, yielding a 5.9x improvement in concentration. For subtraction, the edge spans a small multi-mode family. For , no single harmonic basis suffices, but cross-terms of…
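The point about the discrete-log basis can be illustrated with a small synthetic example. The sketch below assumes a prime modulus `p = 11`, a primitive root `g = 2`, and a hand-built function of the product `a*b mod p`. It does not use any trained model and does not reproduce the reported 5.9x figure. The concentration measure (share of 2D Fourier power in the two largest coefficients) is also an assumption made for illustration.

```python
import numpy as np

def discrete_log_table(p, g):
    """Array t with t[a] = e such that g**e % p == a, for a in 1..p-1."""
    table = np.full(p, -1)
    x = 1
    for e in range(p - 1):
        table[x] = e
        x = (x * g) % p
    if np.any(table[1:] < 0):
        raise ValueError("g is not a primitive root mod p")
    return table

def top_power_fraction(grid, n=2):
    """Share of 2D Fourier power held by the n largest coefficients."""
    power = np.abs(np.fft.fft2(grid)) ** 2
    ranked = np.sort(power.ravel())[::-1]
    return float(ranked[:n].sum() / ranked.sum())

p, g, k = 11, 2, 3                      # hypothetical example values
dlog = discrete_log_table(p, g)

# A function of (a, b) that depends only on the product a*b mod p.
residues = np.arange(1, p)
products = np.outer(residues, residues) % p
raw_grid = np.cos(2 * np.pi * k * dlog[products] / (p - 1))

# Reorder rows and columns so index i corresponds to the residue g**i.
order = np.argsort(dlog[1:])
log_grid = raw_grid[np.ix_(order, order)]

raw_frac = top_power_fraction(raw_grid)
log_frac = top_power_fraction(log_grid)
print(f"raw ordering:          {raw_frac:.3f}")
print(f"discrete-log ordering: {log_frac:.3f}")
```

In the discrete-log ordering, multiplication becomes addition of exponents, so the synthetic function is a single cosine in `i + j` and its Fourier power sits in one conjugate pair of coefficients. In the raw ordering the same values are spread across many coefficients.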
