Feature Repulsion and Spectral Lock-in: An Empirical Study of Two-Layer Network Grokking

Yongzhong Xu

arXiv:2605.08119 · cs.LG · May 12, 2026

TL;DR

This empirical study tests the feature repulsion mechanism in two-layer networks during grokking. It confirms the theoretically predicted sign rule for feature separation and shows that the spectral signature in the parameter updates depends on the activation function.

Contribution

The paper provides the first empirical validation of Tian's feature repulsion theorem and examines how the choice of activation function shapes the spectral signature during grokking.

Findings

01. The feature repulsion sign rule holds across multiple seeds and activation functions.

02. Spectral signatures in the parameter updates depend critically on the activation derivative.

03. Grokking correlates with a rank-2 spectrum in the parameter update matrix.
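
One way to probe the rank-2 finding numerically is to measure how much of an update matrix's spectral energy sits in its top two singular values. The sketch below is an illustration only: the function name `top_k_energy`, the energy-fraction criterion, and the synthetic `delta_W` are assumptions made here, not the paper's stated metric or data.

```python
import numpy as np

def top_k_energy(delta_W, k=2):
    """Fraction of squared singular-value mass in the top k singular values.

    Hypothetical diagnostic: a value close to 1.0 for k=2 means the update
    matrix is (numerically) rank 2.
    """
    s = np.linalg.svd(delta_W, compute_uv=False)  # sorted in descending order
    energy = s ** 2
    return float(energy[:k].sum() / energy.sum())

# Synthetic stand-in for a parameter update: an exactly rank-2 matrix.
rng = np.random.default_rng(0)
U = rng.standard_normal((64, 2))
V = rng.standard_normal((2, 32))
delta_W = U @ V

# A dense Gaussian matrix for contrast; its energy is spread over many modes.
dense = rng.standard_normal((64, 32))

rank2_score = top_k_energy(delta_W, k=2)
dense_score = top_k_energy(dense, k=2)
```

Here `rank2_score` is 1.0 up to floating-point error, while `dense_score` is far smaller. In an actual run, `delta_W` would be the difference between consecutive weight checkpoints, tracked over training.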

Abstract

Tian (2025) proves a repulsion theorem (Theorem 6) for the matrix $B = (F^{\top} F + \eta I)^{-1}$ during the interactive feature-learning stage of grokking: similar features have negative off-diagonal entries $B_{j\ell}$, producing an effective repulsive force that drives them apart. However, the theorem does not specify when this mechanism becomes empirically observable, nor whether it leaves a measurable spectral signature in the parameter updates. We test this directly on Tian's modular addition setup ($M = 71$, $K = 2048$, MSE loss) and observe a clear structure-mechanism dissociation. The predicted sign rule holds robustly on the top-200 most-similar feature pairs across activations (empirical sign-match rising from 0.865 to 0.985 on $\sigma = x^{2}$ across 5 seeds, and saturating at 1.000 on $\sigma = \mathrm{ReLU}$). However, the spectral…
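
The sign rule is easy to see in the two-feature case: if two unit-norm features have inner product $c > 0$, then $F^{\top} F + \eta I$ has off-diagonal entry $c$ and its inverse has off-diagonal entry $-c / ((1 + \eta)^{2} - c^{2}) < 0$. The sketch below computes $B$ and a sign-match rate over the most similar pairs. It assumes features are the columns of `F` and that "most similar" means highest cosine similarity; the function names and the exact metric are hypothetical and may differ from the paper's.

```python
import numpy as np

def repulsion_matrix(F, eta):
    """B = (F^T F + eta * I)^{-1}, with features stored as columns of F (assumed layout)."""
    K = F.shape[1]
    return np.linalg.inv(F.T @ F + eta * np.eye(K))

def sign_match_rate(F, eta, top_k):
    """Fraction of the top_k most cosine-similar feature pairs (j, l) with B[j, l] < 0.

    Illustrative metric only; cosine similarity as the ranking is an assumption.
    """
    G = F.T @ F
    norms = np.sqrt(np.diag(G))
    cos = G / np.outer(norms, norms)
    B = repulsion_matrix(F, eta)
    rows, cols = np.triu_indices(F.shape[1], k=1)   # each unordered pair once
    order = np.argsort(-cos[rows, cols])[:top_k]    # most similar pairs first
    return float(np.mean(B[rows[order], cols[order]] < 0))

# Two unit-norm features with inner product c = 0.8.
F = np.array([[1.0, 0.8],
              [0.0, 0.6]])
B = repulsion_matrix(F, eta=0.1)
rate = sign_match_rate(F, eta=0.1, top_k=1)
```

For this toy `F`, `B[0, 1]` is negative, matching the closed form above with $c = 0.8$ and $\eta = 0.1$. With many features the sign of an individual entry is no longer guaranteed by this two-feature argument, which is why an empirical sign-match rate is informative.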
