Clustering Head: A Visual Case Study of the Training Dynamics in Transformers
Ambroise Odonnat, Wassim Bouaziz, Vivien Cabannes

TL;DR
This paper presents a visual analysis of how transformers learn the sparse modular addition task, revealing a specific circuit type called "clustering heads" and detailing their training dynamics.
Contribution
It introduces a visual sandbox for analyzing transformer training and identifies the clustering heads circuit as key to learning invariants in the task.
Findings
Clustering heads circuits learn problem invariants.
Training exhibits two-stage learning and loss spikes.
Initialization and curriculum learning influence training dynamics.
Abstract
This paper introduces the sparse modular addition task and examines how transformers learn it. We focus on transformers with embeddings in ℝ² and introduce a visual sandbox that provides comprehensive visualizations of each layer throughout the training process. We reveal a type of circuit, called "clustering heads," which learns the problem's invariants. We analyze the training dynamics of these circuits, highlighting two-stage learning, loss spikes due to high curvature or normalization layers, and the effects of initialization and curriculum learning.
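The abstract does not spell out the task, so the following is only a sketch under a stated assumption: each input is a sequence of tokens in {0, ..., p-1}, and the target is the sum of the first k tokens modulo p, with the remaining tokens acting as distractors. The function names, parameter names, and default values are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np


def compute_target(tokens, sparsity, modulus):
    """Sum the first `sparsity` tokens of each sequence, modulo `modulus`.

    Assumed task definition: only the first `sparsity` positions matter.
    """
    return tokens[:, :sparsity].sum(axis=1) % modulus


def sample_sparse_modular_addition(n_samples, seq_len=12, sparsity=5,
                                   modulus=60, seed=0):
    """Draw uniform random token sequences and their targets.

    All defaults are arbitrary illustration values.
    """
    rng = np.random.default_rng(seed)
    tokens = rng.integers(0, modulus, size=(n_samples, seq_len))
    targets = compute_target(tokens, sparsity, modulus)
    return tokens, targets


tokens, targets = sample_sparse_modular_addition(n_samples=8)

# Invariant 1: the distractor tokens (positions >= sparsity) do not affect the target.
no_distractors = tokens.copy()
no_distractors[:, 5:] = 0
assert np.array_equal(compute_target(no_distractors, 5, 60), targets)

# Invariant 2: reordering the first `sparsity` tokens does not affect the target.
reordered = tokens.copy()
reordered[:, :5] = reordered[:, :5][:, ::-1]
assert np.array_equal(compute_target(reordered, 5, 60), targets)
```

Under this reading, the two checks at the end are examples of the kind of invariants a model has to capture: the label ignores the distractor positions and is unchanged by permuting the relevant ones.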
Taxonomy
Topics: Neural Networks and Applications
Methods: Focus
