Characterizing the Expressivity of Fixed-Precision Transformer Language Models

Jiaoda Li; Ryan Cotterell

arXiv:2505.23623 · cs.CL · December 4, 2025

TL;DR

This paper characterizes the expressive power of fixed-precision transformer language models, shows that it matches a fragment of linear temporal logic, and reports empirical results on language generalization that align with the theory.

Contribution

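It establishes an exact correspondence between an idealized class of fixed-precision transformers and a fragment of linear temporal logic, connects that fragment to established classes in formal language theory, automata theory, and algebra, and supports the theory with experiments.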

Findings

01

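Fixed-precision transformers with strict future masking, soft attention, and no positional encodings are exactly as expressive as the fragment of linear temporal logic whose only temporal operator is the past operator.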

02

Transformers trained on languages within their characterized expressive capacity generalize reliably.

03

Models fail to generalize on languages beyond their expressive power.
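
To make the logic side of these findings concrete, here is a minimal sketch of an evaluator for a temporal logic whose only temporal operator is "past". The tuple encoding of formulas, the function names `holds` and `accepts`, the strict reading of "past" (some strictly earlier position), and the convention that a word is accepted when the formula holds at its last position are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import Tuple

# Hypothetical formula encoding (illustrative only):
#   ("sym", "a")     the current position carries symbol "a"
#   ("not", f)       negation
#   ("and", f, g)    conjunction
#   ("past", f)      f held at some strictly earlier position
Formula = Tuple


def holds(formula: Formula, word: str, i: int) -> bool:
    """Return True if `formula` is satisfied at position i of `word`."""
    op = formula[0]
    if op == "sym":
        return word[i] == formula[1]
    if op == "not":
        return not holds(formula[1], word, i)
    if op == "and":
        return holds(formula[1], word, i) and holds(formula[2], word, i)
    if op == "past":
        # Assumption: strict past, so only positions j < i are inspected.
        return any(holds(formula[1], word, j) for j in range(i))
    raise ValueError(f"unknown operator: {op!r}")


def accepts(formula: Formula, word: str) -> bool:
    """Assumed acceptance convention: evaluate at the last position."""
    return len(word) > 0 and holds(formula, word, len(word) - 1)


# "The word ends in b, and an a occurred somewhere before that b."
ends_b_after_a = ("and", ("sym", "b"), ("past", ("sym", "a")))
```

Under these assumptions, `accepts(ends_b_after_a, "acb")` is true, while `accepts(ends_b_after_a, "bb")` is false because no `a` precedes the final `b`.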

Abstract

Transformer-based language models (LMs) have achieved widespread empirical success, but their theoretical expressive power remains only partially understood. In this work, we analyze a restricted idealization of fixed-precision transformers with strict future masking, soft attention, and no positional encodings. We establish that this class of models is exactly as expressive as a specific fragment of linear temporal logic that contains only a single temporal operator: the past operator. We further connect this fragment to established classes in formal language theory, automata theory, and algebra, yielding a unified framework for understanding transformer expressivity under this idealization. Finally, we present empirical results that align closely with our theory: transformers trained on languages within their characterized expressive capacity generalize reliably across sequence…
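
The architectural restrictions named in the abstract can be illustrated with a small sketch of one attention head that uses strict future masking (position i sees only positions j < i), soft (softmax) attention, and no positional information. This is not the authors' construction: the function name `strict_causal_attention` is hypothetical, fixed-precision arithmetic is not modeled (ordinary Python floats are used), and returning a zero vector at the first position, which has nothing to attend to, is an assumption.

```python
import math
from typing import List

Vector = List[float]


def strict_causal_attention(
    queries: List[Vector], keys: List[Vector], values: List[Vector]
) -> List[Vector]:
    """Soft attention where position i attends only to positions j < i."""
    dim = len(values[0])
    outputs: List[Vector] = []
    for i, query in enumerate(queries):
        if i == 0:
            # Assumption: with no earlier positions, output a zero vector.
            outputs.append([0.0] * dim)
            continue
        # Dot-product scores against strictly earlier keys only.
        scores = [sum(q * k for q, k in zip(query, keys[j])) for j in range(i)]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Softmax-weighted average of the strictly earlier values.
        outputs.append(
            [
                sum(w * values[j][d] for j, w in enumerate(weights)) / total
                for d in range(dim)
            ]
        )
    return outputs


# With all-zero queries and keys, attention is uniform over earlier positions.
zeros = [[0.0], [0.0], [0.0]]
averaged = strict_causal_attention(zeros, zeros, [[1.0], [3.0], [5.0]])
```

Because the position index is used only to build the mask and never enters the scores, the head has no access to absolute or relative position, which mirrors the "no positional encodings" restriction.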

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Algorithms

Methods: Softmax · Attention Is All You Need · ALIGN