A model of infant speech perception and learning

Philip Zurbuchen

arXiv:1610.06214 · cs.SD · October 21, 2016

TL;DR

This paper presents a computational model of infant speech perception and learning using neural networks and reinforcement learning, incorporating speech synthesis and a novel approach to speaker normalization.

Contribution

It introduces a new model combining Echo State Networks and reinforcement learning to simulate infant speech acquisition with a focus on speaker normalization.

Findings

01

The model successfully recognizes vowel sounds across different speakers.

02

Infant imitation is improved by caregiver involvement in the learning process.

03

A proposed method addresses speaker normalization in infant speech learning.

Abstract

Infant speech perception and learning are modeled using Echo State Network classification and Reinforcement Learning. Ambient speech for the modeled infant learner is created using the speech synthesizer Vocaltractlab. An auditory system is trained to recognize vowel sounds from a series of speakers of different anatomies in Vocaltractlab. Having formed perceptual targets, the infant uses Reinforcement Learning to imitate its ambient speech. A possible way of addressing the problem of speaker normalisation is proposed, using direct imitation but also including a caregiver who listens to the infant's sounds and imitates those that sound vowel-like.
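To make the Echo State Network classification step in the abstract more concrete, here is a minimal sketch in Python with NumPy. It is not the author's implementation and does not use Vocaltractlab or the ListenAndBabble code. Every function name, the toy two-dimensional "vowel" features, the speaker scale factor, and all hyperparameters are assumptions chosen for illustration only. The sketch shows the general technique: a fixed random reservoir turns each input sequence into a state vector, and only a linear readout is trained, here by ridge regression.

```python
import numpy as np


def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Draw fixed random input and recurrent weights for the reservoir."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def reservoir_features(seq, w_in, w, leak=0.3):
    """Run one sequence through the leaky reservoir; return the mean state."""
    x = np.zeros(w.shape[0])
    total = np.zeros_like(x)
    for u in seq:
        x = (1.0 - leak) * x + leak * np.tanh(w_in @ u + w @ x)
        total += x
    return total / len(seq)


def add_bias(features):
    """Append a constant column so the readout can learn an offset."""
    return np.hstack([features, np.ones((features.shape[0], 1))])


def train_readout(features, labels, n_classes, ridge=1e-2):
    """Fit the linear readout with ridge regression on one-hot targets."""
    x = add_bias(features)
    y = np.eye(n_classes)[labels]
    return np.linalg.solve(x.T @ x + ridge * np.eye(x.shape[1]), x.T @ y)


def predict(features, w_out):
    """Pick the class whose readout unit responds most strongly."""
    return np.argmax(add_bias(features) @ w_out, axis=1)


def toy_vowels(n_per_class, rng, steps=20):
    """Hypothetical stand-in for synthesized vowels: two noisy 2-D tracks.

    The per-sequence scale factor loosely mimics speakers of different
    anatomies; it is an assumption, not the paper's speaker model.
    """
    centres = np.array([[0.3, 0.8], [0.7, 0.2]])
    seqs, labels = [], []
    for label, centre in enumerate(centres):
        for _ in range(n_per_class):
            speaker_scale = rng.uniform(0.9, 1.1)
            noise = rng.normal(0.0, 0.05, size=(steps, 2))
            seqs.append(centre * speaker_scale + noise)
            labels.append(label)
    return seqs, np.array(labels)


def demo(seed=1):
    """Train on toy data and return accuracy on held-out toy sequences."""
    rng = np.random.default_rng(seed)
    w_in, w = make_reservoir(n_in=2, n_res=30)
    train_seqs, train_y = toy_vowels(30, rng)
    test_seqs, test_y = toy_vowels(10, rng)
    train_f = np.array([reservoir_features(s, w_in, w) for s in train_seqs])
    test_f = np.array([reservoir_features(s, w_in, w) for s in test_seqs])
    w_out = train_readout(train_f, train_y, n_classes=2)
    return float(np.mean(predict(test_f, w_out) == test_y))


print(f"held-out accuracy: {demo():.2f}")
```

Averaging the reservoir state over time is one simple way to get a fixed-length feature per utterance; other choices (final state, concatenated states) are equally plausible. Real vowel input would replace `toy_vowels` with acoustic features extracted from audio.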

Code & Models

Repositories

PhilJZ/ListenAndBabble (Official)

Taxonomy

Topics: Blind Source Separation Techniques · Neural Networks and Applications · Speech and Audio Processing