Attention's Gravitational Field: A Power-Law Interpretation of Positional Correlation
Edward Zhang

TL;DR
This paper introduces the Attention Gravitational Field (AGF), an interpretation of positional relationships in Large Language Models that relates them to a power law by analogy with gravitation, with the aim of improving both understanding and model performance.
Contribution
The paper presents the AGF concept, decouples positional encodings from semantic embeddings, and argues that AGF aligns theoretically and empirically with physical laws, with the aim of advancing interpretability and optimization.
Findings
AGF aligns with Newton's Law of Universal Gravitation
Decoupling positional encodings improves model accuracy
AGF shows consistency with learning and stability curves
Abstract
This paper explores the underlying principles of positional relationships and encodings within Large Language Models (LLMs) and introduces the concept of the Attention Gravitational Field (AGF). By decoupling positional encodings from semantic embeddings, we optimize the model architecture and achieve superior accuracy compared to prevailing encoding methods. Furthermore, we provide an in-depth analysis of AGF, demonstrating its intrinsic consistency with learning and stability curves, as well as its empirical alignment with Newton's Law of Universal Gravitation. By offering a rigorous theoretical exploration of these phenomena, this work represents a significant step toward interpreting the Attention mechanism and unlocks new possibilities for future research in model optimization and interpretability.
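The abstract's power-law claim can be made concrete with a small sketch. The paper's actual formulation and data are not given on this page, so everything below is an assumption made for illustration: the function name `fit_power_law` is hypothetical, the data are synthetic, and the exponent of 2 is chosen only to mirror the inverse-square form of Newton's law, F = G·m1·m2 / r². The sketch shows one common way to test for a power law: fit a straight line in log-log space and read the exponent off the slope.

```python
import numpy as np


def fit_power_law(distances, weights):
    """Fit weights ~ c * distances**(-alpha) by least squares in log-log space.

    Hypothetical helper for illustration; not taken from the paper.
    Returns (alpha, c).
    """
    log_d = np.log(np.asarray(distances, dtype=float))
    log_w = np.log(np.asarray(weights, dtype=float))
    # A power law is a straight line in log-log space:
    # log w = log c - alpha * log d
    slope, intercept = np.polyfit(log_d, log_w, 1)
    return -slope, np.exp(intercept)


# Synthetic stand-in for "attention weight vs. token distance" (assumed data).
d = np.arange(1, 65, dtype=float)
w = 3.0 * d ** -2.0  # exact inverse-square decay, by construction

alpha, c = fit_power_law(d, w)
print(f"alpha = {alpha:.3f}, c = {c:.3f}")
```

On real attention statistics the fit would be noisy, so the residuals of the log-log regression matter as much as the fitted exponent when judging whether a power law is a reasonable description.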
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Multimodal Machine Learning Applications
