Statistical Learning in Speech: A Biologically Based Predictive Learning Model
John Rohrlich, Randall C. O'Reilly

TL;DR
This paper presents a biologically plausible neural network model that learns speech patterns through predictive error-driven learning, successfully simulating infant statistical learning and word segmentation from speech data.
Contribution
It introduces a novel biologically inspired model of predictive learning that explains how statistical learning in speech can emerge from local neural mechanisms.
Findings
The model predicts syllables within a word (in-word syllables) more accurately than the first syllable of the following word (next-word syllables).
It successfully simulates infant statistical learning studies.
The approach supports prediction as a basis for word segmentation.
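The gap between in-word and next-word prediction can be illustrated with a toy example. The sketch below is not the paper's biologically based network. It swaps in a simple bigram counter, and the word list, function names, and stream sizes are all hypothetical. It only shows why, in a stream built from fixed words, syllables inside a word are easier to predict than the syllable that starts the next word, which is the cue a prediction-based account of segmentation relies on.

```python
import random
from collections import Counter, defaultdict

# Hypothetical three-syllable "words"; NOT the stimuli used in the paper.
WORDS = [("ba", "di", "ku"), ("go", "la", "tu"),
         ("pa", "bi", "ro"), ("ti", "mo", "fe")]


def make_stream(n_words, seed=0):
    """Concatenate randomly chosen words with no pauses and no immediate repeats."""
    rng = random.Random(seed)
    stream, starts_word = [], []
    prev = None
    for _ in range(n_words):
        word = rng.choice([w for w in WORDS if w != prev])
        prev = word
        for i, syllable in enumerate(word):
            stream.append(syllable)
            starts_word.append(i == 0)  # True where a syllable begins a word
    return stream, starts_word


def train_bigram(stream):
    """Count how often each syllable follows each other syllable."""
    counts = defaultdict(Counter)
    for current, following in zip(stream, stream[1:]):
        counts[current][following] += 1
    return counts


def prediction_accuracy(counts, stream, starts_word):
    """Accuracy of guessing the most frequent successor, split by position."""
    hits = {"in_word": 0, "next_word": 0}
    totals = {"in_word": 0, "next_word": 0}
    for i in range(1, len(stream)):
        kind = "next_word" if starts_word[i] else "in_word"
        successors = counts.get(stream[i - 1])
        guess = successors.most_common(1)[0][0] if successors else None
        hits[kind] += int(guess == stream[i])
        totals[kind] += 1
    return {kind: hits[kind] / totals[kind] for kind in hits}


train_stream, _ = make_stream(300, seed=0)
test_stream, test_starts = make_stream(100, seed=1)
accuracy = prediction_accuracy(train_bigram(train_stream), test_stream, test_starts)
print(accuracy)
```

Because each syllable in this toy lexicon belongs to exactly one word, in-word transitions are deterministic, while a word-final syllable can be followed by the first syllable of any of the three other words. A drop in prediction accuracy therefore marks a likely word boundary.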
Abstract
Infants, adults, non-human primates and non-primates all learn patterns implicitly, and they do so across modalities. The biological evidence supports the hypothesis that the mechanism for this learning is general but computationally local. We hypothesize that the mechanism itself is predictive error-driven learning. We build on recent work that advanced a biologically plausible model of error backpropagation learning which proposes that higher order thalamic nuclei provide a locale for a temporal difference between top-down predictions and an actual event outcome. Our neural network based on that work also models the auditory cortex hierarchy of core, belt and parabelt and the caudal-rostral axis within regions. We simulated two studies showing statistical learning in infants, a seminal study using synthesized speech and a more recent study using human speech. Before simulating these…
Taxonomy
Topics: Language Development and Disorders · Phonetics and Phonology Research · Speech and Audio Processing
