Humans and transformer LMs: Abstraction drives language learning
Jasper Jian, Christopher D. Manning

TL;DR
This paper tracks how a transformer language model develops linguistic categories over the course of training. Abstract class-level behavior emerges earlier than behavior specific to individual lexical items, which points to a central role for abstraction in language learning.
Contribution
It shows that, in a transformer LM (GPT-2 small), different linguistic behaviors emerge abruptly and in sequence during training, with abstraction playing a key role. The result bears on abstract feature-based and concrete exemplar-based accounts of human language acquisition.
Findings
Abstract class-level behavior appears earlier than lexical item-specific behavior.
Different linguistic behaviors emerge abruptly at different training stages.
Abstraction plays a key role in how LMs learn language.
Abstract
Categorization is a core component of human linguistic competence. We investigate how a transformer-based language model (LM) learns linguistic categories by comparing its behaviour over the course of training to behaviours which characterize abstract feature-based and concrete exemplar-based accounts of human language acquisition. We investigate how lexical semantic and syntactic categories emerge using novel divergence-based metrics that track learning trajectories using next-token distributions. In experiments with GPT-2 small, we find that (i) when a construction is learned, abstract class-level behaviour is evident at earlier steps than lexical item-specific behaviour, and (ii) that different linguistic behaviours emerge abruptly in sequence at different points in training, revealing that abstraction plays a key role in how LMs learn. This result informs the models of human…
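As a rough illustration of what a divergence-based trajectory over next-token distributions could look like, the sketch below compares a model's next-token distribution at successive checkpoints against a fixed reference distribution. This is not the paper's metric, which is not specified in this summary: the choice of Jensen-Shannon divergence, the function names `kl_divergence`, `js_divergence`, and `trajectory`, and the toy distributions are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by log(2)."""
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def trajectory(checkpoint_dists, reference):
    """Divergence from a reference distribution at each checkpoint.

    checkpoint_dists: list of next-token distributions, one per training
    checkpoint, each a list of probabilities over the same vocabulary.
    reference: the distribution the behaviour is measured against
    (hypothetical; e.g. the final checkpoint's distribution).
    """
    return [js_divergence(d, reference) for d in checkpoint_dists]


# Toy three-token vocabulary; the values are made up for illustration.
reference = [0.7, 0.2, 0.1]
checkpoints = [
    [1 / 3, 1 / 3, 1 / 3],  # early: close to uniform
    [0.5, 0.3, 0.2],        # mid-training
    [0.7, 0.2, 0.1],        # late: matches the reference
]
curve = trajectory(checkpoints, reference)
```

Under these assumptions, an abrupt drop in such a curve at some checkpoint would mark the point at which the behaviour emerges, and comparing curves for class-level and item-specific references would show which emerges first.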
Taxonomy
Topics: Language Development and Disorders · Neurobiology of Language and Bilingualism · Language and cultural evolution
