On the Anatomy of Attention

Nikhil Khatri, Tuomas Laakkonen, Jonathon Liu, Vincent Wang-Maścianica

arXiv:2407.02423 · cs.LG · July 9, 2024

Open Access

TL;DR

This paper introduces a category-theoretic diagrammatic formalism to analyze and compare attention mechanisms in machine learning, providing a systematic way to understand their structure and variations.

Contribution

It develops a novel formalism for representing ML models diagrammatically and applies it to analyze attention mechanisms, creating a taxonomy and exploring variations.

Findings

01 Identified recurring components of attention mechanisms

02 Constructed a taxonomy of attention variants

03 Recombined components to explore new attention variations

Abstract

We introduce a category-theoretic diagrammatic formalism in order to systematically relate and reason about machine learning models. Our diagrams present architectures intuitively but without loss of essential detail, where natural relationships between models are captured by graphical transformations, and important differences and similarities can be identified at a glance. In this paper, we focus on attention mechanisms: translating folklore into mathematical derivations, and constructing a taxonomy of attention variants in the literature. As a first example of an empirical investigation underpinned by our formalism, we identify recurring anatomical components of attention, which we exhaustively recombine to explore a space of variations on the attention mechanism.

Taxonomy

Topics: Neuroscience, Education and Cognitive Function

Methods: Softmax · Attention Is All You Need · Focus