Exemplar Retrieval Without Overhypothesis Induction: Limits of Distributional Sequence Learning in Early Word Learning

Jon-Paul Cacioli

arXiv:2604.05243·cs.CL·April 8, 2026

TL;DR

Autoregressive transformer language models trained on synthetic corpora retrieve trained exemplars perfectly, yet they fail to induce the second-order generalisation that shape defines object categories and stay at chance on novel nouns.

Contribution

The paper demonstrates that distributional sequence learning, as tested in small autoregressive transformers, does not capture the second-order generalisations (overhypotheses) that are crucial for early word learning.

Findings

01 Every model achieved 100% accuracy on first-order exemplar retrieval.

02 Second-order generalisation to novel nouns remained at chance (50-52%).

03 Models rely on template matching rather than structured abstraction.

Abstract

Background: Children do not simply learn that balls are round and blocks are square. They learn that shape is the kind of feature that tends to define object categories -- a second-order generalisation known as an overhypothesis [1, 2]. What kind of learning mechanism is sufficient for this inductive leap?

Methods: We trained autoregressive transformer language models (3.4M-25.6M parameters) on synthetic corpora in which shape is the stable feature dimension across categories, with eight conditions controlling for alternative explanations.

Results: Across 120 pre-registered runs evaluated on a 1,040-item wug test battery, every model achieved perfect first-order exemplar retrieval (100%) while second-order generalisation to novel nouns remained at chance (50-52%), a result confirmed by equivalence testing. A feature-swap diagnostic revealed that models rely on frame-to-feature template…
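The abstract says the chance-level result was confirmed by equivalence testing but does not give the procedure or the equivalence bounds. As a rough illustration of the idea only, the sketch below runs two one-sided tests (TOST) on a single accuracy score against a chance level of 0.5. The function name `tost_vs_chance`, the margin of 0.05, the normal approximation, and the count of 530 correct responses are all assumptions made for this example; only the 1,040-item battery size comes from the abstract.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tost_vs_chance(correct: int, n: int, chance: float = 0.5,
                   margin: float = 0.05) -> tuple[float, float]:
    """Two one-sided z-tests for equivalence of an accuracy to chance.

    Hypothetical helper, not the paper's procedure. Returns the observed
    accuracy and the TOST p-value (the larger of the two one-sided
    p-values). A small p-value supports the claim that the true accuracy
    lies inside [chance - margin, chance + margin].
    """
    lower, upper = chance - margin, chance + margin
    if not (0.0 < lower < upper < 1.0):
        raise ValueError("equivalence bounds must lie strictly inside (0, 1)")
    p_hat = correct / n
    # Standard errors evaluated at each null boundary (score-style test).
    se_lower = math.sqrt(lower * (1.0 - lower) / n)
    se_upper = math.sqrt(upper * (1.0 - upper) / n)
    # Test 1: null says accuracy <= lower bound; reject when z is large.
    p_lower = 1.0 - normal_cdf((p_hat - lower) / se_lower)
    # Test 2: null says accuracy >= upper bound; reject when z is very negative.
    p_upper = normal_cdf((p_hat - upper) / se_upper)
    return p_hat, max(p_lower, p_upper)


# Illustrative numbers only: 530 correct is an invented count.
accuracy, p_value = tost_vs_chance(correct=530, n=1040)
print(f"accuracy={accuracy:.3f}, TOST p={p_value:.4f}")
```

Unlike a non-significant difference test, which only fails to reject chance performance, a significant TOST result positively supports the claim that accuracy falls within the chosen margin of chance. The conclusion depends on that margin, so a real analysis would fix it in advance, as a pre-registered design would normally do.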
