Order matters: Distributional properties of speech to young children bootstraps learning of semantic representations
Philip A Huebner, Jon A Willits

TL;DR
This study tests whether the order of linguistic input affects what a simple recurrent network learns from child-directed speech. Training on age-ordered input improves learning of semantic structure but not of sequential (grammatical) structure, and the benefit disappears when utterance boundary information is removed.
Contribution
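It provides empirical evidence that simple recurrent networks trained on age-ordered child-directed speech learn semantic structure better than networks trained on the same speech without age ordering, which is consistent with the hypothesis that input order matters for language learning.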
Findings
Age-ordered CHILDES shows a gradual increase in linguistic complexity.
Training on age-ordered input improves learning of semantic structure, but not of sequential structure.
Removing utterance boundary information eliminates the advantage.
Abstract
Some researchers claim that language acquisition is critically dependent on experiencing linguistic input in order of increasing complexity. We set out to test this hypothesis using a simple recurrent neural network (SRN) trained to predict word sequences in CHILDES, a 5-million-word corpus of speech directed to children. First, we demonstrated that age-ordered CHILDES exhibits a gradual increase in linguistic complexity. Next, we compared the performance of two groups of SRNs trained on CHILDES which had either been age-ordered or not. Specifically, we assessed learning of grammatical and semantic structure and showed that training on age-ordered input facilitates learning of semantic, but not of sequential structure. We found that this advantage was eliminated when the models were trained on input with utterance boundary information removed.
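The training conditions described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy transcripts, the `age_months` field, the `<eos>` boundary token, and the helper names `build_stream` and `prediction_pairs` are all hypothetical, and shuffling whole transcripts is only one assumed way to produce input that is not age-ordered.

```python
import random

BOUNDARY = "<eos>"  # hypothetical utterance-boundary token (an assumption)


def build_stream(transcripts, age_ordered=True, keep_boundaries=True, seed=0):
    """Flatten transcripts into one token stream for next-word prediction.

    age_ordered=True sorts transcripts by the target child's age.
    age_ordered=False shuffles transcript order (assumed control condition).
    keep_boundaries=False drops the marker that separates utterances.
    """
    if age_ordered:
        ordered = sorted(transcripts, key=lambda t: t["age_months"])
    else:
        ordered = list(transcripts)
        random.Random(seed).shuffle(ordered)

    stream = []
    for transcript in ordered:
        for utterance in transcript["utterances"]:
            stream.extend(utterance.split())
            if keep_boundaries:
                stream.append(BOUNDARY)
    return stream


def prediction_pairs(stream):
    """Pair each token with the token that follows it (input, target)."""
    return list(zip(stream[:-1], stream[1:]))


# Toy data, invented for illustration only.
toy = [
    {"age_months": 36, "utterances": ["what do you think the dog wants"]},
    {"age_months": 12, "utterances": ["look", "a doggy"]},
    {"age_months": 24, "utterances": ["where is the ball"]},
]

ordered_stream = build_stream(toy, age_ordered=True)
shuffled_stream = build_stream(toy, age_ordered=False)
no_boundary_stream = build_stream(toy, keep_boundaries=False)
```

Both conditions contain exactly the same tokens; only the order in which a network would encounter them differs, which is what lets a comparison isolate the effect of ordering. Dropping the boundary token leaves the words unchanged but removes the cue that marks where one utterance ends and the next begins.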
Taxonomy
Topics: Language Development and Disorders · Speech and dialogue systems · Neurobiology of Language and Bilingualism
