From Embeddings to Dyson Series: Transformer Mechanics as Non-Hermitian Operator Theory
Po-Hao Chang

TL;DR
This paper introduces an operator-theoretic framework for Transformer architectures that recasts their mechanics in the language of many-body physics, using non-Hermitian operator theory.
Contribution
The paper develops an operator-based perspective on Transformers: embedding is treated as a basis transformation, self-attention as a non-Hermitian interaction operator, and deep network composition as regulated operator multiplication.
Findings
Properties of deep Transformers, such as stability at large depth and representational saturation, are explained as structural consequences of operator composition.
Multi-head attention and normalization are shown to be natural structural outcomes rather than merely architectural choices.
The framework bridges deep learning and many-body physics, enabling cross-domain insights.
Abstract
Transformer architectures are typically described in algorithmic and statistical terms, leaving their internal mechanics without a familiar structural language for researchers trained in physical theories. To bridge this gap, we develop a complementary operator-theoretic framework that recasts their mechanics in a language familiar to many-body physics. Beginning from the token as a discrete index without intrinsic geometry, we show that embedding corresponds to a basis transformation into a continuous representation space. Once such a reference basis is established, self-attention naturally assumes the role of a non-Hermitian interaction operator, and network depth implements an ordered composition of these interactions. Within this formulation, several empirical properties of deep Transformers -- including stability at large depth, representational saturation, and the effectiveness of…
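As a rough illustration of two ideas in the abstract, the sketch below builds softmax attention matrices for three toy tokens and checks that (a) an attention matrix is generally not symmetric, which is the real-valued counterpart of non-Hermitian, and (b) composing two residual attention steps depends on their order. Everything here is an assumption made for illustration and is not taken from the paper: the token vectors, the weight matrices `WQ`, `WK_1`, `WK_2`, and the simplification that each attention matrix is computed once from the same input and then held fixed (no value projection, MLP, or normalization).

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_matrix(x, wq, wk):
    """Row-stochastic matrix softmax(Q K^T / sqrt(d)) for token matrix x."""
    q = matmul(x, wq)
    k = matmul(x, wk)
    d = len(wq[0])
    scores = matmul(q, transpose(k))
    return [softmax([s / math.sqrt(d) for s in row]) for row in scores]


def identity_plus(a):
    """Residual step I + A for a square matrix A."""
    n = len(a)
    return [[a[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a[0])))


# Hypothetical toy data: three tokens embedded in two dimensions.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
WQ = [[1.0, 0.0], [0.0, 1.0]]    # assumed query weights (identity)
WK_1 = [[0.0, 1.0], [0.0, 0.0]]  # assumed key weights, first step
WK_2 = [[0.0, 0.0], [1.0, 0.0]]  # assumed key weights, second step

a1 = attention_matrix(X, WQ, WK_1)
a2 = attention_matrix(X, WQ, WK_2)

# Each row is a probability distribution over tokens.
row_sums = [sum(row) for row in a1]

# (a) Non-symmetric: token i attends to j differently than j attends to i.
asymmetry = max_abs_diff(a1, transpose(a1))

# (b) Ordered composition: (I + A2)(I + A1) differs from (I + A1)(I + A2).
step1 = identity_plus(a1)
step2 = identity_plus(a2)
order_gap = max_abs_diff(matmul(step2, step1), matmul(step1, step2))

print("row sums:", row_sums)
print("asymmetry:", asymmetry)
print("order gap:", order_gap)
```

In an actual Transformer the attention matrix at each layer depends on the output of the previous layer, so freezing the two matrices as done here only isolates the non-commutativity of the composition; it does not reproduce the full layer update.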
Taxonomy
Topics: Model Reduction and Neural Networks · Quantum many-body systems · Machine Learning in Materials Science
