Position: The Turing-Completeness of Real-World Autoregressive Transformers Relies Heavily on Context Management
Guanyu Cui, Zhewei Wei, Kun He

TL;DR
This position paper argues that whether a real-world autoregressive Transformer is Turing-complete depends heavily on how it manages context, and it separates the fixed-system setting from the scaling-family setting.
Contribution
It formalizes the fixed-system setting for Transformers, clarifies the limitations of existing Turing-completeness proofs, and emphasizes the critical role that context management plays in a Transformer system's computational power.
Findings
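Existing Turing-completeness proofs are frequently established in the scaling-family setting, where a family of models with increasing context-window length or numerical precision handles different input lengths.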
Context management methods significantly affect computational power.
Real-world LLMs operate in a fixed-system setting, not the scaling-family setting.
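To make the fixed-system setting concrete, here is a minimal Python sketch, not the paper's formalization. It assumes a sliding-window truncation rule as the context-management method and uses a toy function in place of a Transformer. The names `sliding_window`, `toy_model`, and `run_fixed_system` are hypothetical and chosen only for illustration. The point is that one fixed model and one fixed context-management rule handle prompts of any length, and the model never sees more than `window` tokens at a step.

```python
from typing import Callable, List

Token = int


def sliding_window(context: List[Token], window: int) -> List[Token]:
    """Assumed context-management rule: keep only the last `window` tokens."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return context[-window:]


def toy_model(visible: List[Token]) -> Token:
    """Toy stand-in for a fixed autoregressive model (not a real Transformer):
    emits the sum of the visible tokens modulo 10."""
    return sum(visible) % 10


def run_fixed_system(
    prompt: List[Token],
    steps: int,
    window: int,
    model: Callable[[List[Token]], Token] = toy_model,
    manage: Callable[[List[Token], int], List[Token]] = sliding_window,
) -> List[Token]:
    """Run one fixed (model, context-management) pair step by step.

    The same `model`, `manage`, and `window` are used for every prompt length;
    nothing about the system grows with the input.
    """
    context = list(prompt)
    outputs: List[Token] = []
    for _ in range(steps):
        visible = manage(context, window)  # what the model is allowed to see
        next_token = model(visible)        # one autoregressive step
        outputs.append(next_token)
        context.append(next_token)         # the full history keeps growing
    return outputs


if __name__ == "__main__":
    # Same system, two different prompt lengths.
    print(run_fixed_system([1, 2, 3], steps=3, window=2))
    print(run_fixed_system([1, 2, 3, 4, 5, 6, 7], steps=3, window=2))
```

In a scaling-family setting, by contrast, `window` (or the model itself) would be chosen as a function of the input length, so a different system would handle each input size. Swapping `sliding_window` for another rule changes what the fixed system can compute, which is the dependence on context management that the findings describe.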
Abstract
Many works make the eye-catching claim that Transformers are Turing-complete. However, the literature often conflates two distinct settings: (i) a fixed Transformer system setting, in which a fixed autoregressive Transformer is coupled with a fixed context-management method to process inputs of different lengths step by step, and (ii) a scaling-family setting, in which a family of different models (with increasing context-window length or numerical precision) is used to handle different input lengths. Existing proofs of Transformer Turing-completeness are frequently established in setting (ii), whereas real-world LLM deployment and the standard notion of Turing-completeness correspond more naturally to setting (i). In this paper, we first formalize the fixed-system setting, thereby providing a concrete characterization of how real-world LLMs operate. We then argue that results proved in…
