Repetition Without Exclusivity: Scale Sensitivity of Referential Mechanisms in Child-Scale Language Models
Jon-Paul Cacioli

TL;DR
This paper systematically evaluates referential mechanisms in child-scale language models, revealing that they produce repetition-based reference tracking rather than lexical exclusivity, challenging assumptions about innate biases.
Contribution
It demonstrates that distributional learning in child-directed speech leads to repetition priming instead of mutual exclusivity in language models, highlighting the importance of input structure.
Findings
Repetition priming is prevalent across models and decreases with better language modeling.
Models show no sensitivity to multi-sentence referential context.
Repetition effects are explained by embedding similarity, not referential disambiguation.
Abstract
We present the first systematic evaluation of mutual exclusivity (ME) -- the bias to map novel words to novel referents -- in text-only language models trained on child-directed speech. We operationalise ME as referential suppression: when a familiar object is relabelled in a two-referent discourse context, ME predicts decreased probability of the labelled noun at a subsequent completion position. Three pilot findings motivate a pre-registered scale-sensitivity experiment: (1) a masked language model (BabyBERTa) is entirely insensitive to multi-sentence referential context; (2) autoregressive models show robust repetition priming -- the opposite of ME -- when familiar nouns are re-labelled; and (3) a novel context-dependence diagnostic reveals that apparent ME-like patterns with nonce tokens are fully explained by embedding similarity, not referential disambiguation. In the confirmatory…
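The referential-suppression measure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `NextWordDist` interface, the function names, the example sentences, and the toy scorer are all assumptions made for illustration. The idea is to compare the log-probability of the labelled noun at the completion position with and without the relabelling sentence. A negative difference would be ME-like suppression; a positive one would be repetition priming.

```python
import math
from typing import Callable, Dict

# Hypothetical interface: maps a context string to a probability
# distribution over candidate next words. A real experiment would wrap
# an autoregressive language model here.
NextWordDist = Callable[[str], Dict[str, float]]


def log_prob(model: NextWordDist, context: str, word: str,
             floor: float = 1e-12) -> float:
    """Log-probability of `word` as the next word after `context`."""
    return math.log(max(model(context).get(word, 0.0), floor))


def suppression_score(model: NextWordDist, baseline_context: str,
                      relabel_context: str, labelled_noun: str) -> float:
    """Change in log-probability of the labelled noun due to relabelling.

    Negative: ME-like suppression. Positive: repetition priming.
    """
    return (log_prob(model, relabel_context, labelled_noun)
            - log_prob(model, baseline_context, labelled_noun))


def toy_repetition_model(context: str) -> Dict[str, float]:
    """Toy stand-in (an assumption, not a trained model): each candidate
    noun is weighted by 1 plus its number of occurrences in the context."""
    vocab = ["ball", "cup"]
    tokens = context.lower().replace(".", " ").split()
    weights = {w: 1.0 + tokens.count(w) for w in vocab}
    total = sum(weights.values())
    return {w: weights[w] / total for w in vocab}


# Invented two-referent stimuli, for illustration only.
baseline = "Look at the ball and the cup. Give me the"
relabel = "Look at the ball and the cup. That is a ball. Give me the"

score = suppression_score(toy_repetition_model, baseline, relabel, "ball")
print(f"suppression score for 'ball': {score:+.3f}")
```

Because the toy scorer rewards repetition by construction, the score comes out positive, which is the direction the abstract reports for autoregressive models. Swapping in a real model only requires a function with the same `NextWordDist` shape.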
Taxonomy
Topics: Neurobiology of Language and Bilingualism · Language Development and Disorders · Topic Modeling
