Constant Bit-size Transformers Are Turing Complete
Qian Li, Yuyi Wang

TL;DR
This paper shows that a constant bit-size transformer can simulate any Turing machine on inputs of arbitrary length, provided its context window is sufficiently long. This establishes Turing completeness without scaling up the model's precision or parameter count on longer inputs.
Contribution
The paper proves that constant bit-size transformers are Turing complete and shows that the complexity class SPACE[s(n)] exactly characterizes their expressive power when the context window has length s(n).
Findings
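A constant bit-size transformer can simulate any Turing machine on inputs of arbitrary length, given a sufficiently long context window.
With a context window of length s(n), the expressive power of such a transformer is exactly the complexity class SPACE[s(n)].
The behavioral similarity between transformers and Post machines may offer new insights into transformer reasoning.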
Abstract
We prove that any Turing machine running on inputs of arbitrary length can be simulated by a constant bit-size transformer, as long as the context window is sufficiently long. This improves on previous works, which require scaling up either the model's precision or the number of parameters on longer inputs. Furthermore, we prove that the complexity class SPACE[s(n)] exactly characterizes the expressive power of a constant bit-size transformer with a context window of length s(n). Our approach relies on simulating Post machines, a Turing-complete computational model. Post machines can be modeled as automata equipped with a queue, exhibiting computational behaviors naturally aligned with those of transformers. The behavioral similarity between transformers and Post machines may offer new insights into the mechanisms underlying the reasoning abilities of transformers.
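To make the "automaton equipped with a queue" description concrete, here is a minimal sketch of a Post-machine-style simulator. It is not the paper's construction: the transition table `DELTA`, the state names, the end marker `#`, and the function `run_post_machine` are all hypothetical names chosen for illustration. Each step removes the symbol at the front of the queue and appends zero or more symbols at the back, which loosely resembles a sliding context window where the oldest token drops out as new tokens are generated. The example machine recognizes strings of the form a^n b^n by deleting one `a` and one `b` per pass over the queue.

```python
from collections import deque

# Hypothetical transition table for illustration only:
# (state, front symbol) -> (next state, symbols appended to the back).
# Any missing entry means the machine rejects.
DELTA = {
    ("q0", "#"): ("accept", ""),  # only the end marker is left: accept
    ("q0", "a"): ("qa", ""),      # start of a pass: delete one 'a'
    ("qa", "a"): ("qa", "a"),     # copy the remaining 'a's to the back
    ("qa", "b"): ("qb", ""),      # delete one matching 'b'
    ("qb", "b"): ("qb", "b"),     # copy the remaining 'b's to the back
    ("qb", "#"): ("q0", "#"),     # re-append the marker, begin next pass
}


def run_post_machine(delta, word, start="q0", accept="accept", max_steps=10_000):
    """Run a queue automaton on `word`; return True if it reaches `accept`."""
    queue = deque(word + "#")  # '#' marks the end of the current queue contents
    state = start
    for _ in range(max_steps):
        if state == accept:
            return True
        if not queue:
            return False
        symbol = queue.popleft()          # read (and remove) the front symbol
        move = delta.get((state, symbol))
        if move is None:                  # undefined transition: reject
            return False
        state, to_append = move
        queue.extend(to_append)           # write only at the back of the queue
    raise RuntimeError("step budget exhausted")


if __name__ == "__main__":
    for w in ["", "ab", "aabb", "aab", "abb", "ba", "abab"]:
        print(repr(w), run_post_machine(DELTA, w))
```

The machine has only a fixed, finite transition table, while all unbounded storage lives in the queue. That division is the intuition the sketch is meant to convey about a fixed-size model paired with a long context window.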
Taxonomy
Topics: Quantum Computing Algorithms and Architecture · Advanced Memory and Neural Computing · Quantum-Dot Cellular Automata
