Testing the spin-bath view of self-attention: A Hamiltonian analysis of GPT-2 Transformer

Satadeep Bhattacharjee; Seung-Cheol Lee

arXiv:2507.00683·cond-mat.mtrl-sci·January 1, 2026

TL;DR

This paper applies a physics-inspired spin-bath model to analyze GPT-2's attention mechanism, deriving Hamiltonians and phase boundaries that predict token selection, and empirically validating the model's relevance to language generation.

Contribution

It provides the first empirical validation of the spin-bath analogy in a large language model by deriving Hamiltonians and demonstrating causal effects through targeted ablations.

Findings

01 Strong negative correlation between theoretical logit gaps and empirical token rankings ($r \approx -0.70$, $p < 10^{-3}$) across 144 heads and 20 factual-recall prompts.

02 Ablating spin-bath-aligned heads shifts output probabilities as predicted.

03 Hamiltonian analysis offers a physics-grounded interpretation of attention mechanisms.

Abstract

The recently proposed physics-based framework by Huo and Johnson~\cite{huo2024capturing} models the attention mechanism of Large Language Models (LLMs) as an interacting two-body spin system, offering a first-principles explanation for phenomena like repetition and bias. Building on this hypothesis, we extract the complete Query-Key weight matrices from a production-grade GPT-2 model and derive the corresponding effective Hamiltonian for every attention head. From these Hamiltonians, we obtain analytic phase boundaries and logit gap criteria that predict which token should dominate the next-token distribution for a given context. A systematic evaluation on 144 heads across 20 factual-recall prompts reveals a strong negative correlation between the theoretical logit gaps and the model's empirical token rankings ($r \approx -0.70$, $p < 10^{-3}$). Targeted ablations further show that…
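The pipeline described in the abstract (Query-Key weights, then a per-head coupling, then a logit gap, then a correlation with token rank) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual Hamiltonian or code: the bilinear coupling $J = W_Q W_K^\top / \sqrt{d_\text{head}}$, the gap definition (target score minus best competitor score), and the function names `effective_coupling`, `logit_gap`, and `pearson_r` are all hypothetical choices made here, and the demo uses random matrices rather than GPT-2 weights.

```python
import numpy as np


def effective_coupling(W_q, W_k):
    """Assumed per-head bilinear coupling J = W_q W_k^T / sqrt(d_head).

    W_q, W_k: arrays of shape (d_model, d_head) for a single attention head.
    Returns an array of shape (d_model, d_model).
    """
    d_head = W_q.shape[1]
    return W_q @ W_k.T / np.sqrt(d_head)


def logit_gap(J, context_vec, candidates, target_idx):
    """Assumed gap: score of the target minus the best competing score.

    score_j = context_vec^T J candidates[j]; a positive gap means the
    target candidate has the highest score under this head's coupling.
    """
    scores = candidates @ (J.T @ context_vec)
    competitors = np.delete(scores, target_idx)
    return scores[target_idx] - competitors.max()


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    return float(np.corrcoef(np.asarray(x), np.asarray(y))[0, 1])


if __name__ == "__main__":
    # Illustrative only: random stand-ins for one head's weights.
    rng = np.random.default_rng(0)
    d_model, d_head, n_candidates = 16, 4, 5
    W_q = rng.normal(size=(d_model, d_head))
    W_k = rng.normal(size=(d_model, d_head))
    J = effective_coupling(W_q, W_k)
    context = rng.normal(size=d_model)
    cands = rng.normal(size=(n_candidates, d_model))
    print(logit_gap(J, context, cands, target_idx=0))
```

Under this convention, a larger gap would be expected to go with a numerically smaller (better) rank for the target token, which matches the sign of the correlation the abstract reports. Collecting one gap and one empirical rank per head and prompt and passing the two lists to `pearson_r` would give the kind of statistic quoted there.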


Taxonomy

Topics: EEG and Brain-Computer Interfaces