Transfer Learning from Adult to Children for Speech Recognition: Evaluation, Analysis and Recommendations

Prashanth Gurunath Shivakumar; Panayiotis Georgiou

arXiv:1805.03322·eess.AS·May 15, 2018

TL;DR

This paper investigates transfer learning from adult speech models to children's speech recognition using DNNs, analyzing various adaptation techniques and parameters to improve recognition accuracy across different child age groups.

Contribution

It provides a comprehensive analysis of transfer learning strategies for children's speech recognition, including evaluation of adaptation configurations and recommendations for future research.

Findings

01. Transfer learning outperforms standard adaptation techniques.

02. The optimal adaptation configuration depends on the amount of adaptation data and the child's age.

03. Recommendations improve recognition accuracy across diverse child speech data.

Abstract

Children's speech recognition is challenging mainly due to the inherent high variability in children's physical and articulatory characteristics and expressions. This variability manifests in both acoustic constructs and linguistic usage due to the rapidly changing developmental stages in children's lives. Part of the challenge is the lack of large amounts of available children's speech data for efficient modeling. This work attempts to address the key challenges using transfer learning from adult models to children's models in a Deep Neural Network (DNN) framework for the children's Automatic Speech Recognition (ASR) task, evaluating on multiple children's speech corpora with a large vocabulary. The paper presents a systematic and extensive analysis of the proposed transfer learning technique, considering the key factors affecting children's speech recognition from prior literature.…
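
As a rough illustration of the general idea in the abstract (start from a network trained on plentiful adult data, then update only part of it on scarce children's data), here is a minimal sketch in plain Python. It is a toy regression problem, not an ASR system and not the authors' setup: the network size, the synthetic `adult_data` and `child_data`, the helper names (`init_layer`, `forward`, `mse`, `sgd`, `make_data`), and the choice to adapt only the first layer are all assumptions made for illustration.

```python
import copy
import math
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def forward(layers, x):
    """Run the net; return the activations of every layer (input included)."""
    acts = [x]
    for i, (W, b) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + bj for row, bj in zip(W, b)]
        # tanh on hidden layers, linear output layer
        acts.append(z if i == len(layers) - 1 else [math.tanh(v) for v in z])
    return acts

def mse(layers, data):
    """Mean squared error of the net over a list of (input, target) pairs."""
    return sum((forward(layers, x)[-1][0] - y[0]) ** 2 for x, y in data) / len(data)

def sgd(layers, data, trainable, lr=0.02, epochs=50):
    """Plain SGD that updates only the layers whose index is in `trainable`."""
    for _ in range(epochs):
        for x, y in data:
            acts = forward(layers, x)
            delta = [2 * (o - t) for o, t in zip(acts[-1], y)]  # d(loss)/d(output)
            for i in reversed(range(len(layers))):
                W, b = layers[i]
                a_in = acts[i]
                # Gradient w.r.t. this layer's input, taken before W is changed.
                grad_in = [sum(W[j][k] * delta[j] for j in range(len(W)))
                           for k in range(len(a_in))]
                if i in trainable:  # frozen layers are skipped here
                    for j in range(len(W)):
                        for k in range(len(a_in)):
                            W[j][k] -= lr * delta[j] * a_in[k]
                        b[j] -= lr * delta[j]
                if i > 0:  # back through the tanh of the previous layer
                    delta = [g * (1 - a * a) for g, a in zip(grad_in, a_in)]

def make_data(rng, n, scale, shift):
    """Synthetic stand-in data: target = scale * x0 + x1 + shift."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(3)]
        data.append((x, [scale * x[0] + x[1] + shift]))
    return data

rng = random.Random(0)
adult_data = make_data(rng, 200, scale=1.0, shift=0.0)  # plentiful source data
child_data = make_data(rng, 40, scale=1.5, shift=0.5)   # scarce, shifted target data

# 1) Train the "adult" model with every layer trainable.
adult_model = [init_layer(3, 4, rng), init_layer(4, 4, rng), init_layer(4, 1, rng)]
sgd(adult_model, adult_data, trainable={0, 1, 2})

# 2) Copy it and adapt only layer 0 on the child data (an arbitrary choice here).
child_model = copy.deepcopy(adult_model)
sgd(child_model, child_data, trainable={0})

loss_before = mse(adult_model, child_data)
loss_after = mse(child_model, child_data)
print("child-data MSE before adaptation:", loss_before)
print("child-data MSE after adaptation: ", loss_after)
```

Changing the `trainable` set (for example `{2}` or `{0, 1, 2}`) and the size of `child_data` gives a feel for the kind of adaptation configurations the paper compares, though nothing in this toy says which configuration works best for real speech.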
