A Philosophical Introduction to Language Models -- Part I: Continuity With Classic Debates
Raphaël Millière, Cameron Buckner

TL;DR
This paper introduces the philosophical significance of large language models like GPT-4, discussing their relation to classic debates in cognition, language, and AI, and highlighting the need for further empirical research.
Contribution
It provides a philosophical primer and survey on language models, connecting them to longstanding debates and challenging assumptions about neural networks in cognition and language.
Findings
Language models challenge traditional views on compositionality and semantic competence.
They prompt new philosophical questions about cognition and cultural transmission.
Further empirical investigation is necessary to understand their internal mechanisms.
Abstract
Large language models like GPT-4 have achieved remarkable proficiency in a broad spectrum of language-based tasks, some of which are traditionally associated with hallmarks of human intelligence. This has prompted ongoing disagreements about the extent to which we can meaningfully ascribe any kind of linguistic or cognitive competence to language models. Such questions have deep philosophical roots, echoing longstanding debates about the status of artificial neural networks as cognitive models. This article -- the first part of two companion papers -- serves both as a primer on language models for philosophers, and as an opinionated survey of their significance in relation to classic debates in the philosophy of cognitive science, artificial intelligence, and linguistics. We cover topics such as compositionality, language acquisition, semantic competence, grounding, world models, and the…
Taxonomy
Topics: Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Adam · Layer Normalization · Residual Connection · Absolute Position Encodings · Dropout · Dense Connections
