On the Anatomy of Attention
Nikhil Khatri, Tuomas Laakkonen, Jonathon Liu, Vincent Wang-Maścianica

TL;DR
This paper introduces a category-theoretic diagrammatic formalism to analyze and compare attention mechanisms in machine learning, providing a systematic way to understand their structure and variations.
Contribution
It develops a novel formalism for representing ML models diagrammatically and applies it to analyze attention mechanisms, creating a taxonomy and exploring variations.
Findings
Identified recurring components of attention mechanisms
Constructed a taxonomy of attention variants
Recombined components to explore new attention variations
Abstract
We introduce a category-theoretic diagrammatic formalism in order to systematically relate and reason about machine learning models. Our diagrams present architectures intuitively but without loss of essential detail, where natural relationships between models are captured by graphical transformations, and important differences and similarities can be identified at a glance. In this paper, we focus on attention mechanisms: translating folklore into mathematical derivations, and constructing a taxonomy of attention variants in the literature. As a first example of an empirical investigation underpinned by our formalism, we identify recurring anatomical components of attention, which we exhaustively recombine to explore a space of variations on the attention mechanism.
Taxonomy
Topics: Neuroscience, Education and Cognitive Function
Methods: Softmax · Attention Is All You Need · Focus
