Language Models as Models of Language
Raphaël Millière

TL;DR
Modern language models, despite their engineering origins, show promise in capturing complex linguistic structure, prompting a reassessment of their role in linguistic theory and of their potential to advance our understanding of language acquisition.
Contribution
This paper argues for the relevance of language models to linguistic theory and advocates for closer collaboration between linguists and computational researchers.
Findings
Language models can learn hierarchical syntactic structures.
Models exhibit sensitivity to various linguistic phenomena.
Empirical evidence supports the potential of language models in linguistic research.
Abstract
This chapter critically examines the potential contributions of modern language models to theoretical linguistics. Despite their focus on engineering goals, these models' ability to acquire sophisticated linguistic knowledge from mere exposure to data warrants a careful reassessment of their relevance to linguistic theory. I review a growing body of empirical evidence suggesting that language models can learn hierarchical syntactic structure and exhibit sensitivity to various linguistic phenomena, even when trained on developmentally plausible amounts of data. While the competence/performance distinction has been invoked to dismiss the relevance of such models to linguistic theory, I argue that this assessment may be premature. By carefully controlling learning conditions and making use of causal intervention methods, experiments with language models can potentially constrain hypotheses…
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Focus
