Exact Expressive Power of Transformers with Padding

William Merrill; Ashish Sabharwal

arXiv:2505.18948·cs.LG·November 7, 2025

TL;DR

This paper characterizes the exact expressive power of padded transformers: with polynomial padding they recognize precisely FO-uniform TC^0, and looping expands this power further. Because padding tokens are processed in parallel, this offers a more efficient alternative to the sequential decoding of chain-of-thought reasoning.

Contribution

It provides a formal complexity-theoretic analysis of padded transformers, proving a lower bound that matches the known TC^0 upper bound and introducing new tools for analyzing their capabilities.

Findings

01 With polynomial padding, transformers recognize exactly the class FO-uniform TC^0.

02 With polylogarithmic looping, padded transformers recognize exactly FO-uniform NC.

03 Padding and looping together expand transformer expressiveness systematically.

Abstract

Chain of thought is a natural inference-time method for increasing the computational power of transformer-based large language models (LLMs), but comes at the cost of sequential decoding. Are there more efficient alternatives to expand a transformer's expressive power without adding parameters? We consider transformers with padding tokens as a form of parallelizable test-time compute. We show that averaging-hard-attention, masked-pre-norm transformers with polynomial padding recognize precisely the class $\mathsf{FO}$-uniform $\mathsf{TC}^{0}$ of extremely parallelizable problems. While the $\mathsf{TC}^{0}$ upper bound was known, proving a matching lower bound had been elusive. Further, our novel analysis reveals the precise expanded power of padded transformers when coupled with another form of inference-time compute, namely dynamically increasing depth via looping. Our core technical…
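To make two terms from the abstract concrete, here is a minimal Python sketch, not the paper's construction. It assumes the common reading of averaging-hard attention (each position attends uniformly to the positions that tie for the maximum score, under a causal mask) and models polynomial padding as appending n^k padding tokens to an input of length n. The function names, the `<pad>` symbol, and the exponent parameter `k` are illustrative assumptions.

```python
from typing import List, Sequence


def averaging_hard_attention(
    scores: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Causally masked averaging-hard attention (illustrative sketch).

    scores[i][j] is the attention score of query i on key j.
    Position i only sees keys j <= i, and its output is the plain
    average of the value vectors at the keys tying for the top score.
    """
    dim = len(values[0])
    outputs = []
    for i in range(len(scores)):
        visible = scores[i][: i + 1]  # causal mask: keys 0..i
        best = max(visible)
        # Exact equality is fine for a sketch; real code would handle ties
        # with care for floating-point error.
        winners = [j for j, s in enumerate(visible) if s == best]
        outputs.append(
            [sum(values[j][d] for j in winners) / len(winners) for d in range(dim)]
        )
    return outputs


def pad_polynomially(tokens: Sequence[str], k: int, pad: str = "<pad>") -> List[str]:
    """Append n**k padding tokens to an input of length n (assumed scheme)."""
    n = len(tokens)
    return list(tokens) + [pad] * (n ** k)


if __name__ == "__main__":
    scores = [[0, 0, 0], [1, 1, 0], [0, 2, 2]]
    values = [[1.0], [3.0], [5.0]]
    print(averaging_hard_attention(scores, values))
    print(pad_polynomially(["a", "b"], k=2))
```

The padded positions carry no input content; they only give the model extra parallel positions to compute over, so the whole sequence can still be processed in a single forward pass rather than token by token.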
