What do language models model? Transformers, automata, and the format of thought

Colin Klein

arXiv:2508.18598·cs.CL·August 27, 2025

TL;DR

This paper defends the view that large language models primarily model the training corpus rather than human cognition, analyzing their architecture and proposing they function as 'discourse machines' akin to automata.

Contribution

It offers a non-deflationary argument that transformers support at best linear formats for processing, in contrast with the supralinear formats that human linguistic capabilities rely on, and it draws on automata theory to describe language as a 'discourse machine'.

Findings

01 Transformers support linear processing formats.

02 Language models function as 'discourse machines' similar to automata.

03 The architecture's invariants suggest models primarily reflect training data.

Abstract

What do large language models actually model? Do they tell us something about human capacities, or are they models of the corpus we've trained them on? I give a non-deflationary defence of the latter position. Cognitive science tells us that linguistic capabilities in humans rely on supralinear formats for computation. The transformer architecture, by contrast, supports at best a linear format for processing. This argument will rely primarily on certain invariants of the computational architecture of transformers. I then suggest a positive story about what transformers are doing, focusing on Liu et al. (2022)'s intriguing speculations about shortcut automata. I conclude with why I don't think this is a terribly deflationary story. Language is not (just) a means for expressing inner state but also a kind of 'discourse machine' that lets us make new language given appropriate context. We…
