Grokking of Diffusion Models: Case Study on Modular Addition
Joon Hyeok Kim, Yong-Hyun Park, Mattis Dalsætra Østby, Jiatao Gu

TL;DR
This paper investigates how diffusion models generalize and perform modular addition, revealing their internal mechanisms and the process of delayed generalization known as grokking.
Contribution
It provides a mechanistic analysis of diffusion models' internal computations during modular addition, highlighting how they bridge symbolic reasoning and pixel-space generation.
Findings
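In the single-image regime, models implement modular addition by composing periodic representations of the individual operands.
In the diverse-image regime, iterative sampling separates an arithmetic computation phase from a later visual denoising phase, with a critical timestep threshold between them.
Models exhibit grokking: generalization emerges only after a period of overfitting.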
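To make "composing periodic representations" concrete, here is a minimal sketch of one way modular addition can be computed from sine and cosine features of each operand, using the angle-addition identities. This illustrates the general idea only and is not the circuit recovered in the paper; the function name `mod_add_via_periodic` and the choice of frequencies are assumptions made for this example.

```python
import numpy as np

def mod_add_via_periodic(a, b, p, freqs=None):
    """Compute (a + b) mod p from periodic features of a and b.

    Hypothetical illustration: the name and the frequency choice are
    assumptions, not taken from the paper.
    """
    if freqs is None:
        freqs = np.arange(1, p)          # all nonzero frequencies
    w = 2.0 * np.pi * np.asarray(freqs) / p

    # Periodic representation of each operand separately.
    cos_a, sin_a = np.cos(w * a), np.sin(w * a)
    cos_b, sin_b = np.cos(w * b), np.sin(w * b)

    # Compose them with the angle-addition identities to get
    # cos(w(a+b)) and sin(w(a+b)) without ever forming a + b directly.
    cos_ab = cos_a * cos_b - sin_a * sin_b
    sin_ab = sin_a * cos_b + cos_a * sin_b

    # Score each candidate c by sum_k cos(w_k (a + b - c)); the score
    # is largest exactly when c == (a + b) mod p.
    c = np.arange(p)
    phase = np.outer(w, c)               # shape (num_freqs, p)
    scores = cos_ab @ np.cos(phase) + sin_ab @ np.sin(phase)
    return int(np.argmax(scores))
```

For instance, `mod_add_via_periodic(5, 4, 7)` scores all seven residues and selects the one where every frequency's cosine term aligns.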
Abstract
Despite their empirical success, how diffusion models generalize remains poorly understood from a mechanistic perspective. We demonstrate that diffusion models trained with flow-matching objectives exhibit grokking (delayed generalization after overfitting) on modular addition, enabling controlled analysis of their internal computations. We study this phenomenon in two data regimes. In a single-image regime, mechanistic dissection reveals that the model implements modular addition by composing periodic representations of individual operands. In a diverse-image regime with high intraclass variability, we find that the model leverages its iterative sampling process to partition the task into an arithmetic computation phase followed by a visual denoising phase, separated by a critical timestep threshold. Our work provides the mechanistic decomposition of algorithmic learning…
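For readers unfamiliar with flow matching, the sketch below shows a common form of the training objective and the iterative sampler it implies. It assumes a linear interpolation path between noise `x0` and data `x1`, with target velocity `x1 - x0`; the abstract does not state which path or sampler the authors use, so that choice, along with the names `flow_matching_loss`, `euler_sample`, and `velocity_fn`, is an assumption made for illustration.

```python
import numpy as np

def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between predicted and target velocity.

    Assumes the linear path x_t = (1 - t) * x0 + t * x1 (an assumption,
    not confirmed by the source). `velocity_fn(x, t)` stands in for the
    network; x0 is noise, x1 is data, t has shape (batch,).
    """
    # Reshape t so it broadcasts over all non-batch dimensions.
    t_b = t.reshape(-1, *([1] * (x0.ndim - 1)))
    x_t = (1.0 - t_b) * x0 + t_b * x1
    target = x1 - x0                      # velocity of the linear path
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - target) ** 2))

def euler_sample(velocity_fn, x0, num_steps=50):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = np.full(x.shape[0], i * dt)   # same timestep for the batch
        x = x + dt * velocity_fn(x, t)
    return x
```

Because sampling is a sequence of steps indexed by `t`, different ranges of `t` can in principle carry different parts of the computation, which is the kind of split the abstract describes for the diverse-image regime.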
