Universal computation is intrinsic to language model decoding
Alex Lewandowski, Marlos C. Machado, Dale Schuurmans

TL;DR
This paper proves that language models inherently possess the capacity for universal computation through their autoregressive decoding, and training primarily enhances their programmability rather than their computational power.
Contribution
It proves that chaining a language model's autoregressive output is sufficient for universal computation, and argues that training improves prompt-based programmability rather than computational expressiveness.
Findings
Randomly initialized models are capable of universal computation.
Training enhances the ease of eliciting computational behavior via prompts.
Language models' computational power is intrinsic, not acquired through training.
Abstract
Language models now provide an interface to express and often solve general problems in natural language, yet their ultimate computational capabilities remain a major topic of scientific debate. Unlike a formal computer, a language model is trained to autoregressively predict successive elements in human-generated text. We prove that chaining a language model's autoregressive output is sufficient to perform universal computation. That is, a language model can simulate the execution of any algorithm on any input. The challenge of eliciting desired computational behaviour can thus be reframed in terms of programmability: the ease of finding a suitable prompt. Strikingly, we demonstrate that even randomly initialized language models are capable of universal computation before training. This implies that training does not give rise to computational expressiveness -- rather, it improves…
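To make "chaining autoregressive output" concrete, here is a minimal sketch. It is not the paper's construction. It assumes a hypothetical stand-in for the model, `next_tokens`, which is a fixed lookup table rather than a neural network, and a hypothetical decoding loop, `decode`, which appends each output to the context and drops the oldest symbols. The rule set is an illustrative 2-tag system that computes a Collatz-style iteration. Tag systems in general are a classical universal model of computation, but this particular three-rule instance is only a toy.

```python
# Illustrative only: a lookup table plays the role of the "model".
# The names RULES, next_tokens and decode are hypothetical, not from the paper.
RULES = {"a": "bc", "b": "a", "c": "aaa"}


def next_tokens(context: str) -> str:
    """Stand-in for one decoding step: the output depends only on the
    first symbol of the current context."""
    return RULES[context[0]]


def decode(prompt: str, max_steps: int = 1000) -> list[str]:
    """Chain the step function on its own output.

    Each step appends the emitted symbols to the end of the context and
    drops the two oldest symbols from the front (a 2-tag system update).
    Decoding halts when fewer than two symbols remain.
    """
    context = prompt
    trace = [context]
    for _ in range(max_steps):
        if len(context) < 2:
            break
        context = context[2:] + next_tokens(context)
        trace.append(context)
    return trace


if __name__ == "__main__":
    # The prompt encodes the input n = 3 as "aaa".
    for word in decode("aaa"):
        print(word)
```

In this sketch the prompt acts as the program input: starting from `"aaa"`, the words made only of `a` that appear during decoding have lengths 3, 5, 8, 4, 2, 1, after which decoding halts. The step function never changes. Only the prompt and the chaining of outputs drive the computation, which is the sense in which the abstract frames the problem as one of programmability.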
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Algorithms
