A Distributional Perspective on Word Learning in Neural Language Models
Filippo Ficarra, Ryan Cotterell, Alex Warstadt

TL;DR
This paper introduces a distributional approach to analyzing how neural language models learn words. It proposes new metrics that better capture lexical knowledge and compares the resulting learning trajectories with those of children, finding significant differences.
Contribution
The paper develops improved distributional signatures for assessing word learning in language models and systematically compares model learning trajectories with human data, drawing out methodological insights along the way.
Findings
The proposed distributional signatures capture lexical knowledge better than the signatures studied in prior work.
The metrics show that models' learning trajectories differ from children's.
Multiple metrics provide complementary insights.
Abstract
Language models (LMs) are increasingly being studied as models of human language learners. Due to the nascency of the field, it is not well-established whether LMs exhibit similar learning dynamics to humans, and there are few direct comparisons between learning trajectories in humans and models. Word learning trajectories for children are relatively well-documented, and recent work has tried to extend these investigations to language models. However, there are no widely agreed-upon metrics for word learning in language models. We take a distributional approach to this problem, defining lexical knowledge in terms of properties of the learned distribution for a target word. We argue that distributional signatures studied in prior work fail to capture key distributional information. Thus, we propose an array of signatures that improve on earlier approaches by capturing knowledge of both…
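As a rough illustration of what a distributional signature for a target word can look like, the sketch below computes the mean surprisal of a word over contexts in which it occurs, at two training checkpoints. This is an assumption made for illustration only and is not one of the signatures proposed in the paper; the function name `mean_surprisal` and all probability values are hypothetical.

```python
import math


def mean_surprisal(target_probs):
    """Average surprisal, in bits, of a target word across contexts.

    target_probs holds the probability p(word | context) that a model
    assigns to the target word in each context where the word occurs.
    """
    if not target_probs:
        raise ValueError("need at least one context")
    return sum(-math.log2(p) for p in target_probs) / len(target_probs)


# Hypothetical probabilities for one target word at two checkpoints
# (made-up numbers, not results from the paper).
early_checkpoint = [0.125, 0.25, 0.125]
late_checkpoint = [0.5, 0.5, 0.25]

early = mean_surprisal(early_checkpoint)
late = mean_surprisal(late_checkpoint)

# Lower mean surprisal at the later checkpoint is one simple signal
# that the model predicts the word better in its attested contexts.
print(f"early: {early:.3f} bits, late: {late:.3f} bits")
```

Tracking a quantity like this across checkpoints yields a learning trajectory for the word, which is the kind of curve that can then be compared against child acquisition data.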
Taxonomy
Topics: Natural Language Processing Techniques
