A Unified Geometric Field Theory Framework for Transformers: From Manifold Embeddings to Kernel Modulation
Xianshuai Shi, Jianfeng Zhu, and Leibo Liu

TL;DR
This paper introduces a unified geometric framework for Transformers, interpreting their components as kernel-modulated operators on manifolds, providing a new mathematical understanding of their core mechanisms.
Contribution
It develops a theoretical framework that unifies positional encoding, attention, and kernel operators within a geometric and field-theoretic perspective.
Findings
Maps discrete positions to continuous manifold functions
Provides a field-theoretic interpretation of Transformer layers
Lays groundwork for further theoretical analysis of Transformers
Abstract
The Transformer architecture has achieved tremendous success in natural language processing, computer vision, and scientific computing through its self-attention mechanism. However, its core components, positional encoding and the attention mechanism, have lacked a unified physical or mathematical interpretation. This paper proposes a structural theoretical framework that integrates positional encoding, kernel integral operators, and attention mechanisms for in-depth theoretical investigation. We map discrete positions (such as text token indices and image pixel coordinates) to spatial functions on continuous manifolds, enabling a field-theoretic interpretation of Transformer layers as kernel-modulated operators acting over embedded manifolds.
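To make the kernel-operator reading concrete, the sketch below treats attention over positions as a discretized, normalized kernel integral operator. It is an illustration under stated assumptions, not the paper's construction: the unit circle as the manifold, the exponential dot-product kernel, and the names `embed_on_circle`, `kernel`, and `kernel_attention` are all hypothetical choices made here. Learned query, key, and value projections are omitted, so the kernel depends on position alone.

```python
import math


def embed_on_circle(index, length):
    """Map a discrete token index to a point on the unit circle.

    Assumption for illustration: the circle stands in for the
    continuous manifold that positions are embedded into.
    """
    theta = 2.0 * math.pi * index / length
    return (math.cos(theta), math.sin(theta))


def kernel(x, y, scale=1.0):
    """Exponential dot-product kernel (the unnormalized softmax weight)."""
    return math.exp(scale * (x[0] * y[0] + x[1] * y[1]))


def kernel_attention(values, scale=1.0):
    """Apply a normalized kernel operator to a scalar field on the positions.

    out[i] = sum_j k(x_i, x_j) * v_j / sum_j k(x_i, x_j),
    a Riemann-sum analogue of an integral operator with kernel k.
    """
    n = len(values)
    points = [embed_on_circle(i, n) for i in range(n)]
    out = []
    for xi in points:
        weights = [kernel(xi, xj, scale) for xj in points]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, values)) / total)
    return out


# A field concentrated at position 1 is spread to nearby positions on the circle.
field = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
smoothed = kernel_attention(field, scale=2.0)
```

With `scale=0.0` every weight is equal and the operator reduces to a plain average. Larger values of `scale` concentrate each output on positions that lie close together on the circle, which is the sense in which the kernel modulates the field.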
Taxonomy
Topics: Face Recognition and Perception · Handwritten Text Recognition Techniques · Ferroelectric and Negative Capacitance Devices
