Latent Algorithmic Structure Precedes Grokking: A Mechanistic Study of ReLU MLPs on Modular Arithmetic
Anand Swaroop

TL;DR
This paper investigates the internal mechanisms of ReLU MLPs trained on modular addition, showing that they develop near-binary square-wave input weights and phase-aligned output weights, a structure that emerges before the network generalizes.
Contribution
It uncovers the latent algorithmic structure in ReLU MLPs that precedes grokking, showing how input weights evolve into near-binary square waves and output weights into phase-aligned forms, and introduces an idealized model capturing this behavior.
Findings
ReLU MLPs learn near-binary square wave input weights.
Output weights exhibit a phase-sum relation, $\phi_{\mathrm{out}} = \phi_a + \phi_b$.
Idealized model achieves high accuracy using extracted Fourier components.
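The phase-sum finding can be checked numerically once per-neuron phases are available. The sketch below assumes the relation has the form "output phase equals the sum of the two input phases, modulo 2π"; the names `phi_a`, `phi_b`, `phi_out` and the example values are illustrative assumptions, not quantities taken from the paper.

```python
import numpy as np

def wrap_phase(theta):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (theta + np.pi) % (2 * np.pi) - np.pi

def phase_sum_residual(phi_a, phi_b, phi_out):
    """Wrapped difference phi_out - (phi_a + phi_b); near zero when the relation holds."""
    return wrap_phase(phi_out - (phi_a + phi_b))

# Hypothetical phases for three neurons (illustrative values, not from the paper).
phi_a = np.array([0.3, 2.0, -1.0])
phi_b = np.array([1.1, 2.5, -2.9])
phi_out = wrap_phase(phi_a + phi_b)  # constructed here so the relation holds exactly

residual = phase_sum_residual(phi_a, phi_b, phi_out)
```

Comparing wrapped residuals rather than raw differences matters because phases are only defined modulo 2π; a raw difference of 2π would otherwise look like a large violation.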
Abstract
Grokking, the phenomenon where validation accuracy of neural networks on modular addition of two integers rises long after training data has been memorized, has been characterized in previous works as producing sinusoidal input weight distributions in transformers and multi-layer perceptrons (MLPs). We find empirically that ReLU MLPs in our experimental setting instead learn near-binary square wave input weights, where intermediate-valued weights appear exclusively near sign-change boundaries, alongside output weight distributions whose dominant Fourier phases satisfy a phase-sum relation $\phi_{\mathrm{out}} = \phi_a + \phi_b$; this relation holds even when the model is trained on noisy data and fails to grok. We extract the frequency and phase of each neuron's weights via DFT and construct an idealized MLP: Input weights are replaced by perfect binary square waves and output weights by…
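The extraction step described in the abstract can be sketched as follows. This is a minimal illustration, assuming the weight vector of one neuron is indexed by the input residue 0..p-1, that the "dominant" component is the largest non-DC DFT bin, and that an idealized input weight is the sign of a cosine at that frequency and phase. The function names, the modulus `p = 97`, and the test signal are hypothetical, and the construction of the output weights is not shown because the abstract is truncated at that point.

```python
import numpy as np

def dominant_frequency_and_phase(w):
    """Return (k, phi) for the largest non-DC DFT component of a real vector w."""
    spectrum = np.fft.rfft(w)
    k = int(np.argmax(np.abs(spectrum[1:]))) + 1  # skip the DC term at index 0
    phi = float(np.angle(spectrum[k]))
    return k, phi

def square_wave(p, k, phi):
    """Binary (+1/-1) square wave: sign of cos(2*pi*k*x/p + phi) for x = 0..p-1."""
    x = np.arange(p)
    return np.where(np.cos(2 * np.pi * k * x / p + phi) >= 0, 1.0, -1.0)

# Hypothetical example: a synthetic weight vector standing in for one trained neuron.
p = 97
w = square_wave(p, k=5, phi=0.7)

k_hat, phi_hat = dominant_frequency_and_phase(w)
w_ideal = square_wave(p, k_hat, phi_hat)  # idealized replacement for w
agreement = float(np.mean(w_ideal == w))  # fraction of entries that match
```

With NumPy's DFT sign convention, a pure cosine cos(2πkx/p + φ) has phase φ in bin k, which is why `np.angle` is used directly. For a sampled square wave the recovered phase is only approximate, so a few entries near sign-change boundaries may differ between `w` and `w_ideal`.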
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Neural Networks and Reservoir Computing
