Training-Induced Escape from Token Clustering in a Mean-Field Formulation of Transformers
Noboru Isobe, Daisuke Inoue, Masaaki Imaizumi

TL;DR
This paper studies a noisy mean-field Transformer in which only a parameter-linear FFN is trained under regularization, and shows that training can cause the token distribution to leave the attention-driven clustered regime near the final layers.
Contribution
It introduces a training-aware mean-field framework that models how training reshapes token clustering in Transformers, extending prior mean-field analyses that largely treat model parameters as prescribed.
Findings
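Training can induce a phase in which the token distribution, after initially following attention-driven clustering, leaves the clustered regime near the final layers.
The analysis is based on an entropy-regularized interaction energy that captures the clustering bias of attention.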
Results suggest training and inference dynamics should be modeled together.
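As a rough illustration of what an entropy-regularized interaction energy can look like, the block below uses the exponential-kernel interaction energy common in earlier mean-field analyses of self-attention. The symbols \beta (inverse temperature), \tau (noise level), and the density \mu on token space are assumptions introduced here, and the paper's own functional may differ.

```latex
\mathcal{F}_{\beta,\tau}[\mu]
  = -\frac{1}{2\beta} \iint e^{\beta \langle x, y \rangle} \, d\mu(x) \, d\mu(y)
    + \tau \int \mu(x) \log \mu(x) \, dx
```

Under this assumed form, the first term decreases as mass concentrates, which expresses the clustering bias, while the entropy term penalizes concentration.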
Abstract
Transformers perform inference by iteratively transforming token representations across layers. This layerwise computation has been studied empirically, and recent mean-field theories of Transformer dynamics explain how attention can drive token distributions toward clustering. However, existing mean-field analyses largely treat model parameters as prescribed, leaving open how training reshapes this clustering picture. We study this question in a noisy mean-field Transformer in which only a parameter-linear FFN is trained under regularization. We find and analyze a training-induced phase in the dynamics: after initially following attention-driven clustering, the token distribution can leave the clustered regime near the final layers. Our mathematical analysis is based on an entropy-regularized interaction energy that captures the clustering bias of attention. More broadly, our…
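The attention-driven clustering that the abstract takes as its starting point can be sketched with a small particle simulation. This is a minimal sketch under stated assumptions, not the paper's model: tokens live on the unit sphere, the query/key/value matrices are the identity, and renormalization stands in for layer normalization. The names `attention_step`, `mean_cosine`, `beta`, `dt`, and `noise` are hypothetical. No FFN or training is modeled, so the sketch shows only the clustering phase, not the training-induced escape.

```python
import numpy as np

def attention_step(x, beta=1.0, dt=0.1, noise=0.0, rng=None):
    """One Euler step of softmax self-attention dynamics on the unit sphere.

    Assumes identity query/key/value matrices (an illustrative choice).
    """
    logits = beta * (x @ x.T)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    drift = weights @ x
    # Keep only the component tangent to the sphere at each token.
    drift -= np.sum(drift * x, axis=1, keepdims=True) * x
    x_new = x + dt * drift
    if noise > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        x_new = x_new + np.sqrt(2.0 * noise * dt) * rng.standard_normal(x.shape)
    # Renormalize each token (a stand-in for layer normalization).
    return x_new / np.linalg.norm(x_new, axis=1, keepdims=True)

def mean_cosine(x):
    """Average pairwise inner product between distinct tokens."""
    n = x.shape[0]
    gram = x @ x.T
    return (gram.sum() - np.trace(gram)) / (n * (n - 1))

rng = np.random.default_rng(0)
x0 = rng.standard_normal((32, 3))
x0 /= np.linalg.norm(x0, axis=1, keepdims=True)

x = x0.copy()
for _ in range(500):  # each step plays the role of one layer
    x = attention_step(x)

start, end = mean_cosine(x0), mean_cosine(x)
print(f"mean cosine similarity: start={start:.3f}, end={end:.3f}")
```

Passing `noise > 0` adds a diffusion term at each step, which is one simple way to mimic the noisy setting and see how it counteracts concentration.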
