Autoregressive Large Language Models are Computationally Universal

Dale Schuurmans; Hanjun Dai; Francesco Zanini

arXiv:2410.03170·cs.CL·October 7, 2024·2 cites

TL;DR

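This paper argues that autoregressive large language models can perform universal computation without weight changes: under a generalized form of autoregressive decoding, the model behaves like a Lag system, a classical model of computation, and gemini-1.5-pro-001 is tested on simulating a universal Lag system that in turn simulates a universal Turing machine.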

Contribution

It introduces a new theoretical framework linking autoregressive decoding to universal computation and shows that existing models can simulate a universal Turing machine.

Findings

01. Autoregressive models can realize universal computation without weight modification.

02. A single prompt can drive the model to simulate a universal Lag system.

03. The model can implement a Turing machine through extended autoregressive decoding.

Abstract

We show that autoregressive decoding of a transformer-based language model can realize universal computation, without external intervention or modification of the model's weights. Establishing this result requires understanding how a language model can process arbitrarily long inputs using a bounded context. For this purpose, we consider a generalization of autoregressive decoding where, given a long input, emitted tokens are appended to the end of the sequence as the context window advances. We first show that the resulting system corresponds to a classical model of computation, a Lag system, that has long been known to be computationally universal. By leveraging a new proof, we show that a universal Turing machine can be simulated by a Lag system with 2027 production rules. We then investigate whether an existing large language model can simulate the behaviour of such a universal Lag…
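
To make the decoding scheme in the abstract concrete, here is a minimal Python sketch of a toy Lag system. It rests on several assumptions that are not taken from the paper: each symbol is a single character, a rule maps the leading `window` symbols to a string that is appended to the end, and exactly one symbol is dropped from the front per step. The function name `run_lag_system`, its parameters, and the three rules in the usage note are hypothetical illustrations, not the paper's 2027-rule construction.

```python
from collections import deque
from itertools import islice


def run_lag_system(rules, tape, window=2, max_steps=1000):
    """Run a toy Lag system (illustrative sketch, not the paper's construction).

    rules:     dict mapping a context string of length `window` to the
               string appended to the end of the sequence (may be empty).
    tape:      initial sequence; each character is one symbol (assumption).
    window:    number of leading symbols a rule inspects.
    max_steps: safety bound, since a Lag system need not halt.
    """
    queue = deque(tape)
    for _ in range(max_steps):
        # Halt when too few symbols remain to fill the context window.
        if len(queue) < window:
            break
        context = "".join(islice(queue, window))
        # Halt when no rule matches the current context.
        if context not in rules:
            break
        # Append the emitted symbols to the end of the sequence...
        queue.extend(rules[context])
        # ...then advance the window by dropping the leading symbol.
        queue.popleft()
    return "".join(queue)
```

With the made-up rules `{"ab": "b", "bb": "a", "ba": ""}` and the starting tape `"ab"`, the sequence evolves as `ab`, `bb`, `ba`, `a`, and the run stops once fewer than two symbols remain. The `max_steps` bound is a practical guard only: a universal system can run forever on some inputs.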

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques