Filling in the Mechanisms: How do LMs Learn Filler-Gap Dependencies under Developmental Constraints?
Atrey Desai, Sathvik Nair

TL;DR
This study investigates how language models learn filler-gap dependencies from limited data. The models appear to develop shared mechanisms across constructions, but they need more data than humans do, which points to the importance of language-specific biases.
Contribution
The paper shows that LMs trained on limited data can develop shared filler-gap representations, yet still require more data than humans, which highlights the need for language-specific biases in models.
Findings
Shared mechanisms may develop with limited training data.
Language models need more data than humans for similar generalizations.
Language-specific biases are crucial for efficient language acquisition.
Abstract
For humans, filler-gap dependencies require a shared representation across different syntactic constructions. Although causal analyses suggest this may also be true for LLMs (Boguraev et al., 2025), it is still unclear if such a representation also exists for language models trained on developmentally feasible quantities of data. We applied Distributed Alignment Search (DAS; Geiger et al., 2024) to LMs trained on varying amounts of data from the BabyLM challenge (Warstadt et al., 2023) to evaluate whether representations of filler-gap dependencies transfer between wh-questions and topicalization, which vary greatly in input frequency. Our results suggest shared, yet item-sensitive mechanisms may develop with limited training data. More importantly, LMs still require far more data than humans to learn comparable generalizations, highlighting the need for…
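The abstract names Distributed Alignment Search without describing its mechanics. As a rough illustration only, the sketch below shows the kind of interchange intervention DAS is built around: hidden states from a "base" and a "source" input are rotated into a new basis, a small subspace of the base is overwritten with the source's values, and the result is rotated back. Everything here is an assumption made for illustration: the function name, the dimensions, and especially the rotation, which is random in this sketch whereas DAS learns it by gradient descent. It is not the authors' implementation.

```python
import numpy as np

def interchange_in_rotated_subspace(h_base, h_source, rotation, k):
    """Hypothetical helper: overwrite the first k rotated coordinates of
    h_base with those of h_source, then map back to the original basis."""
    z_base = rotation @ h_base        # base hidden state in the rotated basis
    z_source = rotation @ h_source    # source hidden state in the rotated basis
    z_mixed = z_base.copy()
    z_mixed[:k] = z_source[:k]        # swap only the aligned subspace
    return rotation.T @ z_mixed       # orthogonal matrix: inverse is the transpose

rng = np.random.default_rng(0)
d, k = 8, 2                           # illustrative sizes, not from the paper

# Assumption: a random orthogonal matrix stands in for the learned rotation.
rotation, _ = np.linalg.qr(rng.normal(size=(d, d)))

h_base = rng.normal(size=d)           # e.g. hidden state from a wh-question
h_source = rng.normal(size=d)         # e.g. hidden state from a topicalization
h_new = interchange_in_rotated_subspace(h_base, h_source, rotation, k)
```

In a transfer experiment of the kind the abstract describes, one would presumably fit the rotation on one construction and check whether the same subspace supports interventions on the other; that training loop is omitted here.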
