Grokking From Abstraction to Intelligence

Junjie Zhang; Zhen Shen; Gang Xiong; Xisong Dong

arXiv:2603.29262 · cs.AI · April 1, 2026

TL;DR

This paper investigates the mechanistic origins of grokking in modular arithmetic models, revealing that a spontaneous internal simplification driven by parsimony underpins the transition from memorization to generalization.

Contribution

The paper explains grokking by linking model simplification, information compression, and manifold collapse, combining causal, spectral, and algorithmic complexity measures with Singular Learning Theory.

Findings

01. Grokking involves a collapse of redundant manifolds.
02. Model generalization correlates with deep information compression.
03. Spontaneous simplification driven by parsimony underpins grokking.

Abstract

Grokking in modular arithmetic has established itself as the quintessential fruit fly experiment, serving as a critical domain for investigating the mechanistic origins of model generalization. Despite its significance, existing research remains narrowly focused on specific local circuits or optimization tuning, largely overlooking the global structural evolution that fundamentally drives this phenomenon. We propose that grokking originates from a spontaneous simplification of internal model structures governed by the principle of parsimony. We integrate causal, spectral, and algorithmic complexity measures alongside Singular Learning Theory to reveal that the transition from memorization to generalization corresponds to the physical collapse of redundant manifolds and deep information compression, offering a novel perspective for understanding the mechanisms of model overfitting and…
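To make the setting concrete, here is a minimal sketch of the kind of modular arithmetic task the abstract refers to, together with one common spectral complexity proxy. Everything in it is an assumption for illustration: the abstract does not specify the operation, the modulus, or the exact measures used, so the choice of modular addition, the parameter `p`, and the function names `modular_addition_table` and `effective_rank` are hypothetical. The effective rank shown here (the exponential of the entropy of the normalized singular values) is a standard quantity and is not claimed to be the paper's spectral measure.

```python
import numpy as np


def modular_addition_table(p):
    """Return every pair (a, b) with 0 <= a, b < p and the label (a + b) mod p.

    Assumption: modular addition stands in for the paper's task.
    """
    a, b = np.meshgrid(np.arange(p), np.arange(p), indexing="ij")
    pairs = np.stack([a.ravel(), b.ravel()], axis=1)
    labels = (pairs[:, 0] + pairs[:, 1]) % p
    return pairs, labels


def effective_rank(matrix):
    """Exponential of the Shannon entropy of the normalized singular values.

    Assumption: a generic spectral proxy, not the paper's own measure.
    A lower value means the spectrum is concentrated in fewer directions.
    """
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    total = singular_values.sum()
    if total == 0:
        return 0.0
    q = singular_values / total
    q = q[q > 0]  # drop exact zeros so log is defined
    return float(np.exp(-(q * np.log(q)).sum()))


if __name__ == "__main__":
    pairs, labels = modular_addition_table(5)
    print(pairs.shape, labels[:5])
    print(effective_rank(np.eye(4)))
```

Tracking a quantity like this on a weight matrix across training checkpoints is one way a drop in structural complexity could be observed, under the assumptions stated above.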
