Expressivity of Transformers: A Tropical Geometry Perspective

Ye Su; Yong Liu

arXiv:2604.14727·cs.LG·April 17, 2026

TL;DR

This paper introduces a tropical geometry framework for analyzing the expressivity of transformers. It characterizes exactly how they partition input space and how the number of linear regions grows with sequence length and embedding dimension.

Contribution

The paper models self-attention as a vector-valued tropical rational map, derives tight asymptotic bounds on the number of linear regions, and shows that the resulting partitions remain stable under soft attention.
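
For context on the word "tropical": the standard link between softmax-style attention and max-plus algebra is the zero-temperature limit of log-sum-exp. The bound below is a well-known identity, not a statement quoted from the paper; the scores $x_j$ and the temperature $\tau$ are notation assumed here for illustration.

$$
\max_j x_j \;\le\; \tau \log \sum_{j=1}^{N} e^{x_j/\tau} \;\le\; \max_j x_j + \tau \log N,
\qquad\text{so}\qquad
\lim_{\tau \to 0^+} \tau \log \sum_{j=1}^{N} e^{x_j/\tau} = \max_j x_j .
$$

Under this limit, sums turn into maxima and products into sums, which is the arithmetic of the max-plus (tropical) semiring.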

Findings

01

Self-attention evaluates exactly to a Power Voronoi Diagram in the zero-temperature limit.

02

Multi-head self-attention raises polyhedral complexity from O(N) for a single head to O(N^H), i.e. exponentially in the number of heads H.

03

The number of linear regions scales as Θ(N^{d_model L}), a combinatorial explosion driven by sequence length N.
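
The gap between one head and several heads can be illustrated with a small counting experiment. This is a minimal sketch, not the paper's Minkowski-sum argument: it assumes hard (argmax) attention, random hypothetical keys, and illustrative values for `N`, `H`, and `d`, and it simply counts how many distinct "which key wins in each head" patterns appear over sampled queries.

```python
import numpy as np

rng = np.random.default_rng(0)
N, H, d = 4, 3, 2  # tokens, heads, query dimension (illustrative values)

# Hypothetical per-head key vectors and a cloud of sampled query points.
keys = rng.normal(size=(H, N, d))
queries = rng.normal(size=(20000, d)) * 3

# scores[h, m, n] = <query m, key n of head h>; the winner is the argmax over n.
scores = np.einsum("hnd,md->hmn", keys, queries)
winners = np.argmax(scores, axis=2)  # shape (H, num_queries)

# One head can distinguish at most N cells.
single_head_cells = len(np.unique(winners[0]))

# H heads jointly distinguish tuples of winners: at most N**H patterns.
joint_cells = len(np.unique(winners.T, axis=0))

print(single_head_cells, joint_cells, N ** H)
```

Sampling can only undercount cells, and in low dimensions the number of realized patterns typically stays well below `N ** H`, so the experiment shows the upper bounds rather than their tightness.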

Abstract

To quantify the geometric expressivity of transformers, we introduce a tropical geometry framework to characterize their exact spatial partitioning capabilities. By modeling self-attention as a vector-valued tropical rational map, we prove it evaluates exactly to a Power Voronoi Diagram in the zero-temperature limit. Building on this equivalence, we establish a combinatorial rationale for Multi-Head Self-Attention (MHSA): via the Minkowski sum of Newton polytopes, multi-head aggregation expands the polyhedral complexity to $O(N^{H})$, overcoming the $O(N)$ bottleneck of single heads. Extending this to deep architectures, we derive the first tight asymptotic bounds on the number of linear regions in transformers ($\Theta(N^{d_{\text{model}} L})$), demonstrating a combinatorial explosion driven intrinsically by sequence length $N$, ambient embedding dimension…
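
The zero-temperature claim can be checked numerically on a toy example. The sketch below is an illustration under stated assumptions, not the paper's construction: it uses plain dot-product scores without bias, random hypothetical keys `K` and query `q`, and the standard fact that maximizing $q \cdot k_j$ is the same as minimizing the power distance $\lVert q - k_j \rVert^2 - w_j$ with weights $w_j = \lVert k_j \rVert^2$.

```python
import numpy as np

def soft_attention_weights(q, K, tau):
    """Softmax over scores <q, k_j> / tau, computed in a numerically stable way."""
    scores = K @ q / tau
    scores = scores - scores.max()  # shift so the largest exponent is 0
    w = np.exp(scores)
    return w / w.sum()

def power_cell(q, K):
    """Index of the power-diagram cell containing q.

    Sites are the keys k_j and weights are ||k_j||^2 (an assumption made here),
    so the power distance is ||q - k_j||^2 - ||k_j||^2 = ||q||^2 - 2 <q, k_j>.
    """
    power_dist = ((q - K) ** 2).sum(axis=1) - (K ** 2).sum(axis=1)
    return int(np.argmin(power_dist))

rng = np.random.default_rng(0)
K = rng.normal(size=(6, 3))  # 6 hypothetical keys in 3 dimensions
q = rng.normal(size=3)       # one hypothetical query

hard = int(np.argmax(K @ q))                          # hard-attention winner
cell = power_cell(q, K)                               # power-diagram cell of q
weights_cold = soft_attention_weights(q, K, tau=1e-3)  # near-zero temperature

print(hard, cell, int(np.argmax(weights_cold)))
```

All three printed indices should agree: the hard-attention winner, the power-diagram cell, and the token that dominates the softmax at low temperature.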
