Layerwise Dynamics for In-Context Classification in Transformers
Patrick Lutz, Themistoklis Haris, Arjun Chandra, Aditya Gangrade, Venkatesh Saligrama

TL;DR
This paper uncovers the internal dynamics of transformers performing in-context classification, revealing an explicit recursive update rule that amplifies class separation and makes the computation interpretable.
Contribution
The paper introduces an identifiable model that enforces feature- and label-permutation equivariance at every layer, yielding what the authors describe as, to their knowledge, the first explicit recursion extracted from a softmax transformer for classification.
Findings
Derives an explicit depth-indexed recursion for transformer dynamics.
Shows that attention matrices drive coupled updates of data and labels.
Proves that the dynamics can amplify class separation and yield robust expected class alignment.
Abstract
Transformers can perform in-context classification from a few labeled examples, yet the inference-time algorithm remains opaque. We study multi-class linear classification in the hard no-margin regime and make the computation identifiable by enforcing feature- and label-permutation equivariance at every layer. This enables interpretability while maintaining functional equivalence and yields highly structured weights. From these models we extract an explicit depth-indexed recursion: an end-to-end identified, emergent update rule inside a softmax transformer, to our knowledge the first of its kind. Attention matrices formed from mixed feature-label Gram structure drive coupled updates of training points, labels, and the test probe. The resulting dynamics implement a geometry-driven algorithmic motif, which can provably amplify class separation and yields robust expected class alignment.
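To make the phrase "attention matrices formed from mixed feature-label Gram structure drive coupled updates" more concrete, here is a minimal NumPy sketch of one such layer. It is not the recursion derived in the paper: the function name `layer_step`, the weights `alpha` and `beta`, the step size `eta`, and the convex-combination update are all assumptions chosen for illustration. The sketch only shows the general shape of a depth-indexed update in which softmax attention over a feature-plus-label Gram matrix moves training points, labels, and the test probe together.

```python
import numpy as np

def softmax_rows(S):
    # Numerically stable row-wise softmax.
    S = S - S.max(axis=1, keepdims=True)
    E = np.exp(S)
    return E / E.sum(axis=1, keepdims=True)

def layer_step(X, Y, x_test, y_test, alpha=1.0, beta=1.0, eta=0.5):
    """One hypothetical layer (illustrative only, not the paper's rule).

    X: (n, d) training features, Y: (n, c) label embeddings,
    x_test: (d,) test probe features, y_test: (c,) running label estimate.
    alpha, beta, eta are assumed scalars, not values from the paper.
    """
    # Mixed feature-label Gram scores between training points.
    G = alpha * (X @ X.T) + beta * (Y @ Y.T)           # (n, n)
    A = softmax_rows(G)                                # rows sum to 1
    # The test probe attends to the training points with the same mix.
    g = alpha * (X @ x_test) + beta * (Y @ y_test)     # (n,)
    a = softmax_rows(g[None, :])[0]                    # (n,), sums to 1
    # Coupled updates: features and labels share the attention weights.
    X_new = (1.0 - eta) * X + eta * (A @ X)
    Y_new = (1.0 - eta) * Y + eta * (A @ Y)
    x_new = (1.0 - eta) * x_test + eta * (a @ X)
    y_new = (1.0 - eta) * y_test + eta * (a @ Y)
    return X_new, Y_new, x_new, y_new

def run_layers(X, Y, x_test, depth=4, **kwargs):
    # Iterate the step; the probe's label estimate starts at zero (unknown).
    y_test = np.zeros(Y.shape[1])
    for _ in range(depth):
        X, Y, x_test, y_test = layer_step(X, Y, x_test, y_test, **kwargs)
    return int(np.argmax(y_test)), y_test
```

Because the scores depend on the inputs only through inner products, this toy step commutes with any permutation of the feature coordinates or of the label coordinates, which is the kind of equivariance the abstract says is enforced at every layer. Whether iterating it separates classes depends on the data and on the assumed scalars; the provable guarantees belong to the paper's own recursion, not to this sketch.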
