Transformers in Uniform TC$^0$

David Chiang

arXiv:2409.13629 · cs.CC · January 6, 2025

TL;DR

This paper proves that average-hard attention transformers (AHATs) computed exactly, and softmax-attention transformers (SMATs) computed with polynomially many bits of precision, recognize only languages in DLOGTIME-uniform TC$^0$, strengthening earlier results that assumed O(log n)-bit arithmetic.

Contribution

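It improves on prior work by showing that AHATs with no approximation, SMATs with O(poly(n)) bits of floating-point precision, and SMATs with at most $2^{-O(\mathrm{poly}(n))}$ absolute error are all in DLOGTIME-uniform TC$^0$, removing the earlier restriction to O(log n)-bit arithmetic.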

Findings

01

AHATs with no approximation are in DLOGTIME-uniform TC$^0$

02

SMATs with O(poly(n)) bits of floating-point precision are in DLOGTIME-uniform TC$^0$

03

SMATs with at most $2^{-O(\mathrm{poly}(n))}$ absolute error are in DLOGTIME-uniform TC$^0$

Abstract

Previous work has shown that the languages recognized by average-hard attention transformers (AHATs) and softmax-attention transformers (SMATs) are within the circuit complexity class TC$^0$. However, these results assume limited-precision arithmetic: using floating-point numbers with O(log n) bits (where n is the length of the input string), Strobl showed that AHATs can be approximated in L-uniform TC$^0$, and Merrill and Sabharwal showed that SMATs can be approximated in DLOGTIME-uniform TC$^0$. Here, we improve these results, showing that AHATs with no approximation, SMATs with O(poly(n)) bits of floating-point precision, and SMATs with at most $2^{-O(\mathrm{poly}(n))}$ absolute error are all in DLOGTIME-uniform TC$^0$.
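
To make the two attention variants named in the abstract concrete, here is a minimal Python sketch of a single attention step for one query. It is an illustration only and is not taken from the paper: the function names, the sample scores, and the sample values are all assumptions. Average-hard attention is computed with exact rationals (no approximation), while softmax attention is computed in ordinary floating point, which is the kind of limited-precision setting the abstract contrasts against.

```python
from fractions import Fraction
import math


def average_hard_attention(scores, values):
    """Average the values at every position whose score ties for the maximum.

    Uses exact rational arithmetic, so no rounding occurs.
    """
    top = max(scores)
    winners = [v for s, v in zip(scores, values) if s == top]
    return sum(winners, Fraction(0)) / len(winners)


def softmax_attention(scores, values):
    """Softmax-weighted average of the values, in floating point.

    Subtracting the maximum score before exponentiating is the usual
    trick to avoid overflow; it does not change the result.
    """
    top = max(scores)
    weights = [math.exp(float(s - top)) for s in scores]
    total = sum(weights)
    return sum(w * float(v) for w, v in zip(weights, values)) / total


# Hypothetical attention scores and values for a length-4 input.
scores = [Fraction(1), Fraction(3), Fraction(3), Fraction(0)]
values = [Fraction(10), Fraction(2), Fraction(5), Fraction(7)]

print(average_hard_attention(scores, values))  # exact Fraction
print(softmax_attention(scores, values))       # float approximation
```

In this sketch the average-hard result depends only on the two positions that tie for the highest score, whereas the softmax result gives every position some nonzero weight, which is why its precision requirements have to be stated separately.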

Taxonomy

Topics: Electromagnetic Scattering and Analysis · Numerical methods for differential equations · Advanced Numerical Methods in Computational Mathematics

Methods: Softmax · Attention Is All You Need