Characterizing the Expressivity of Fixed-Precision Transformer Language Models
Jiaoda Li, Ryan Cotterell

TL;DR
This paper characterizes the expressive power of an idealized class of fixed-precision transformers, shows that it matches a fragment of linear temporal logic containing only the past operator, and reports empirical results on language generalization that align with the theory.
Contribution
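It establishes an exact correspondence between the expressivity of the idealized transformers and a fragment of linear temporal logic, connects that fragment to established classes in formal language theory, automata theory, and algebra, and supports the theory with empirical results.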
Findings
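Fixed-precision transformers with strict future masking, soft attention, and no positional encodings are exactly as expressive as the fragment of linear temporal logic whose only temporal operator is the past operator.
Transformers trained on languages within this characterized expressive capacity generalize reliably.
Transformers fail to generalize on languages beyond this expressive capacity.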
Abstract
Transformer-based language models (LMs) have achieved widespread empirical success, but their theoretical expressive power remains only partially understood. In this work, we analyze a restricted idealization of fixed-precision transformers with strict future masking, soft attention, and no positional encodings. We establish that this class of models is exactly as expressive as a specific fragment of linear temporal logic that contains only a single temporal operator: the past operator. We further connect this fragment to established classes in formal language theory, automata theory, and algebra, yielding a unified framework for understanding transformer expressivity under this idealization. Finally, we present empirical results that align closely with our theory: transformers trained on languages within their characterized expressive capacity generalize reliably across sequence…
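To make the logic fragment mentioned in the abstract more concrete, the following is a minimal Python sketch of how a formula with a single past operator could be evaluated over a string. It is illustrative only and is not taken from the paper. The class names (`Sym`, `Not`, `And`, `Past`), the helper functions `holds` and `accepts`, the strict reading of "past" (some earlier position, excluding the current one), and the convention that a string is accepted when the formula holds at its last position are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical formula syntax for illustration; the paper's notation may differ.
@dataclass(frozen=True)
class Sym:
    """Atomic formula: the symbol at the current position equals `char`."""
    char: str


@dataclass(frozen=True)
class Not:
    arg: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Past:
    """Assumed strict past: `arg` holds at some position before the current one."""
    arg: "Formula"


Formula = Union[Sym, Not, And, Past]


def holds(f: Formula, w: str, i: int) -> bool:
    """Return True if formula `f` holds at position `i` of string `w`."""
    if isinstance(f, Sym):
        return w[i] == f.char
    if isinstance(f, Not):
        return not holds(f.arg, w, i)
    if isinstance(f, And):
        return holds(f.left, w, i) and holds(f.right, w, i)
    if isinstance(f, Past):
        # Only positions strictly before i are inspected.
        return any(holds(f.arg, w, j) for j in range(i))
    raise TypeError(f"unknown formula: {f!r}")


def accepts(f: Formula, w: str) -> bool:
    """Assumed acceptance convention: `f` must hold at the last position."""
    return len(w) > 0 and holds(f, w, len(w) - 1)


# "The current symbol is b, and an a occurred somewhere earlier."
phi = And(Sym("b"), Past(Sym("a")))

print(accepts(phi, "acb"))  # an 'a' precedes the final 'b'
print(accepts(phi, "bb"))   # no 'a' anywhere before the final 'b'
```

Under these assumptions, `Past` inspects only earlier positions, which loosely mirrors the strict future masking described in the abstract: a position never looks at itself or at anything to its right.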
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Algorithms
Methods: Softmax · Attention Is All You Need · ALIGN
