The Spectral Geometry of Thought: Phase Transitions, Instruction Reversal, Token-Level Dynamics, and Perfect Correctness Prediction in How Transformers Reason

Yi Liu

arXiv:2604.15350·cs.LG·April 20, 2026

TL;DR

This paper uncovers spectral phase transitions in transformer models' hidden states during reasoning versus recall, revealing universal geometric patterns, architecture-specific dynamics, and predictive markers of correctness.

Contribution

It introduces a spectral theory of reasoning in transformers, identifying core phenomena and establishing spectral analysis as a tool for understanding model thought processes.

Findings

01

In 9 of 11 models, reasoning yields a significantly lower spectral α than factual recall (p < 0.05), with larger effects in stronger models.

02

Instruction tuning reverses the spectral relationship: base models show reasoning α below factual α, while instruction-tuned models show the opposite.

03

Spectral alpha can predict correctness with near-perfect accuracy.
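
This summary does not say how the spectral α is computed. As a rough illustration only, the sketch below assumes α is the power-law decay exponent of the eigenvalue spectrum of the hidden-state covariance, estimated by a least-squares fit in log-log space. The function names `spectral_alpha` and `synthetic_hidden`, the fitting window, and the estimator itself are assumptions made for illustration, not the paper's stated method.

```python
import numpy as np


def spectral_alpha(hidden, k_min=1, k_max=None):
    """Estimate a power-law exponent alpha such that eigenvalue_k ~ k^(-alpha).

    hidden: array of shape (n_tokens, d_model) holding one layer's activations.
    k_min, k_max: 1-based rank window used for the fit (hypothetical knobs).
    """
    # Center each feature so squared singular values match covariance eigenvalues.
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    sing = np.linalg.svd(centered, compute_uv=False)
    eig = sing ** 2 / max(hidden.shape[0] - 1, 1)
    # Drop numerically zero eigenvalues before taking logarithms.
    eig = eig[eig > eig[0] * 1e-12]
    hi = eig.size if k_max is None else min(k_max, eig.size)
    ranks = np.arange(1, eig.size + 1)
    x = np.log(ranks[k_min - 1:hi])
    y = np.log(eig[k_min - 1:hi])
    slope, _ = np.polyfit(x, y, 1)
    return -slope


def synthetic_hidden(n_tokens, d_model, alpha, seed=0):
    """Build activations whose covariance spectrum decays exactly as k^(-alpha)."""
    rng = np.random.default_rng(seed)
    raw = rng.standard_normal((n_tokens, d_model))
    raw -= raw.mean(axis=0, keepdims=True)
    # Orthonormal, mean-zero left factors and an orthogonal right factor.
    left, _ = np.linalg.qr(raw)
    right, _ = np.linalg.qr(rng.standard_normal((d_model, d_model)))
    scales = np.arange(1, d_model + 1) ** (-alpha / 2.0)
    return (left * scales) @ right.T


if __name__ == "__main__":
    steep = spectral_alpha(synthetic_hidden(200, 32, alpha=1.5))
    flat = spectral_alpha(synthetic_hidden(200, 32, alpha=0.5))
    print(f"steep spectrum alpha = {steep:.3f}, flat spectrum alpha = {flat:.3f}")
```

Under this assumed definition, a larger α means variance is concentrated in fewer directions. Real activations would come from a model's hidden states at a chosen layer, and the fit window would need tuning because empirical spectra rarely follow a clean power law across all ranks.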

Abstract

We discover that large language models exhibit spectral phase transitions in their hidden activation spaces when engaging in reasoning versus factual recall. Through systematic spectral analysis across 11 models spanning 5 architecture families (Qwen, Pythia, Phi, Llama, DeepSeek-R1), we identify seven core phenomena: (1) Reasoning Spectral Compression: 9/11 models show significantly lower $α$ for reasoning ($p < 0.05$), with larger effects in stronger models; (2) Instruction Tuning Spectral Reversal: base models show reasoning $α <$ factual $α$, while instruction-tuned models reverse this relationship; (3) Architecture-Dependent Generation Taxonomy: prompt-to-response shifts partition into expansion, compression, and equilibrium regimes; (4) Spectral Scaling Law: $\alpha_\text{reasoning}…
