From Embeddings to Dyson Series: Transformer Mechanics as Non-Hermitian Operator Theory

Po-Hao Chang

arXiv:2603.11322 · cond-mat.dis-nn · March 18, 2026

Open Access

TL;DR

This paper introduces an operator-theoretic framework for Transformer architectures, recasting their mechanics in the language of many-body physics through non-Hermitian operator theory.

Contribution

It develops a novel operator-based perspective on Transformers, interpreting self-attention as a non-Hermitian interaction operator and deep network composition as regulated operator multiplication.
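As a rough illustration of these two readings, the NumPy sketch below builds a row-softmax attention matrix, checks that it is not Hermitian, and expands an ordered product of two residual-style factors. The names `attention_operator` and `random_weights` are hypothetical, and the setup rests on simplifying assumptions that are not taken from the paper: a single head, no value or output projections, no normalization, and operators held fixed rather than recomputed from the updated representation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_operator(X, Wq, Wk):
    # Hypothetical helper: token-by-token attention matrix softmax(Q K^T / sqrt(d)).
    d = Wq.shape[1]
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d)
    return softmax(scores, axis=-1)

def random_weights(rng, d_model):
    # Hypothetical helper: random projection scaled so the scores stay O(1).
    return rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)

rng = np.random.default_rng(0)
n_tokens, d_model = 4, 8
X = rng.normal(size=(n_tokens, d_model))

# Two attention operators standing in for two layers (assumption: both are
# evaluated on the same X, i.e. a frozen, linearized picture).
A1 = attention_operator(X, random_weights(rng, d_model), random_weights(rng, d_model))
A2 = attention_operator(X, random_weights(rng, d_model), random_weights(rng, d_model))

# Rows sum to one, but the matrix is generally not equal to its conjugate transpose.
row_sums = A1.sum(axis=1)
is_hermitian = np.allclose(A1, A1.conj().T)

# Depth as an ordered product of residual-style factors (I + A_l).
I = np.eye(n_tokens)
ordered = (I + A2) @ (I + A1)
expanded = I + A1 + A2 + A2 @ A1        # later layer stands to the left
reversed_order = (I + A1) @ (I + A2)    # differs because A1 and A2 do not commute

print("Hermitian:", is_hermitian)
print("order matters:", not np.allclose(ordered, reversed_order))
```

The expansion `I + A1 + A2 + A2 @ A1` keeps the later layer to the left, the kind of ordering that a Dyson-type series tracks. The paper's own construction, including what it means by regulated multiplication, is not reproduced here.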

Findings

01

Empirical properties of deep Transformers, such as stability at large depth and representational saturation, are explained as structural consequences of operator composition.

02

Multi-head attention and normalization are shown to be natural structural outcomes rather than purely architectural choices.

03

The framework recasts Transformer mechanics in a language familiar to many-body physics, linking the two fields.

Abstract

Transformer architectures are typically described in algorithmic and statistical terms, leaving their internal mechanics without a familiar structural language for researchers trained in physical theories. To bridge this gap, we develop a complementary operator-theoretic framework that recasts their mechanics in a language familiar to many-body physics. Beginning from the token as a discrete index without intrinsic geometry, we show that embedding corresponds to a basis transformation into a continuous representation space. Once such a reference basis is established, self-attention naturally assumes the role of a non-Hermitian interaction operator, and network depth implements an ordered composition of these interactions. Within this formulation, several empirical properties of deep Transformers -- including stability at large depth, representational saturation, and the effectiveness of…
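The embedding step described in the abstract can be pictured with a small NumPy sketch. It treats a token as a one-hot vector in an index basis with no geometry (all distinct tokens orthogonal), and shows that multiplying by an embedding matrix is the same as a row lookup. The matrix `E`, the sizes, and the helper `one_hot` are hypothetical choices made for illustration, not values or names from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 5, 3

# Hypothetical embedding matrix: one row of continuous coordinates per token.
E = rng.normal(size=(vocab_size, d_model))

def one_hot(token_id, size):
    # Hypothetical helper: a token as a discrete basis vector.
    v = np.zeros(size)
    v[token_id] = 1.0
    return v

token_ids = [3, 0, 3]

# Embedding as a linear map applied to one-hot index vectors...
via_matmul = np.stack([one_hot(t, vocab_size) @ E for t in token_ids])
# ...which coincides with the usual table lookup.
via_lookup = E[token_ids]

# In the index basis every pair of distinct tokens is orthogonal, so the Gram
# matrix is the identity and carries no notion of similarity.
gram_index = np.eye(vocab_size) @ np.eye(vocab_size).T
# After embedding, overlaps between tokens are generally nonzero.
gram_embedded = E @ E.T

print(np.allclose(via_matmul, via_lookup))
```

Because `E` is rectangular here, the map is a change of representation rather than an invertible change of basis in the strict sense. Whether the paper assumes a square or rectangular map is not stated on this page.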

Taxonomy

Topics: Model Reduction and Neural Networks · Quantum many-body systems · Machine Learning in Materials Science