Exemplar Retrieval Without Overhypothesis Induction: Limits of Distributional Sequence Learning in Early Word Learning
Jon-Paul Cacioli

TL;DR
This study shows that autoregressive language models trained on synthetic corpora retrieve exemplars perfectly but fail to learn the second-order generalization that shape defines object categories, a key step in early word learning.
Contribution
It demonstrates the limitations of distributional sequence learning models in capturing second-order generalizations crucial for early language development.
Findings
Models achieved 100% accuracy on first-order exemplar retrieval.
Models stayed at chance (50-52%) on second-order generalization to novel nouns.
Models rely on template matching rather than structured abstraction.
Abstract
Background: Children do not simply learn that balls are round and blocks are square. They learn that shape is the kind of feature that tends to define object categories -- a second-order generalisation known as an overhypothesis [1, 2]. What kind of learning mechanism is sufficient for this inductive leap? Methods: We trained autoregressive transformer language models (3.4M-25.6M parameters) on synthetic corpora in which shape is the stable feature dimension across categories, with eight conditions controlling for alternative explanations. Results: Across 120 pre-registered runs evaluated on a 1,040-item wug test battery, every model achieved perfect first-order exemplar retrieval (100%) while second-order generalisation to novel nouns remained at chance (50-52%), a result confirmed by equivalence testing. A feature-swap diagnostic revealed that models rely on frame-to-feature template…
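The abstract reports that chance-level performance was "confirmed by equivalence testing" but does not say which procedure was used. As a rough illustration only, the sketch below applies two one-sided tests (TOST) to a single accuracy proportion with a normal approximation. The function names, the ±5 percentage-point equivalence margin, and the example counts are all assumptions made for illustration. They are not the paper's pre-registered analysis or its data.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tost_vs_chance(correct: int, n: int, chance: float = 0.5,
                   margin: float = 0.05) -> tuple[float, float]:
    """Two one-sided tests (TOST) for equivalence of an accuracy to chance.

    Returns (observed accuracy, TOST p-value). A small p-value supports the
    claim that the true accuracy lies inside [chance - margin, chance + margin].
    NOTE: `margin` is a hypothetical choice, not a value taken from the paper.
    """
    p_hat = correct / n
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)  # Wald standard error
    if se == 0.0:
        raise ValueError("accuracy of exactly 0 or 1: Wald standard error is zero")

    # Test 1: H0 is accuracy <= chance - margin; reject when p_hat is well above it.
    p_lower = 1.0 - normal_cdf((p_hat - (chance - margin)) / se)
    # Test 2: H0 is accuracy >= chance + margin; reject when p_hat is well below it.
    p_upper = normal_cdf((p_hat - (chance + margin)) / se)

    # Equivalence requires rejecting both nulls, so report the larger p-value.
    return p_hat, max(p_lower, p_upper)


if __name__ == "__main__":
    # Hypothetical counts for illustration: 530 correct out of 1,040 items.
    acc, p = tost_vs_chance(correct=530, n=1040)
    print(f"accuracy={acc:.3f}, TOST p={p:.4f}")
```

Unlike a failure to reject the null in an ordinary significance test, a significant TOST result is positive evidence that accuracy falls within the chosen margin around 50%. This is why an equivalence test suits a claim of chance-level performance.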
