Non-native children speech recognition through transfer learning

Marco Matassoni; Roberto Gretter; Daniele Falavigna; Diego Giuliani

arXiv:1809.09658·cs.CL·September 27, 2018

TL;DR

This paper explores multi-task and transfer learning techniques for adapting a multi-language DNN acoustic model to non-native children's speech. The model shares one IPA-based phone set across Italian, German, and English, and is adapted with limited non-native audio from young students learning English and German.

Contribution

The paper evaluates transfer-learning methods for adapting a multi-lingual DNN, trained on children's native speech, to non-native children's speech using limited adaptation data, and reports improved recognition performance.
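As a rough illustration of the general idea of transfer learning by fine-tuning, the sketch below keeps a "pretrained" hidden layer frozen and re-estimates only the output layer on a handful of adaptation samples. It is not the authors' method: the layer sizes, the weights in `W_hidden`, the toy samples in `adapt_data`, and the function names are all hypothetical, and a real acoustic model would use a deep-learning or speech toolkit rather than hand-written Python.

```python
import math

def hidden(x, W):
    # Frozen hidden layer: tanh of a linear projection of the input features.
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(x, W_hidden, W_out):
    h = hidden(x, W_hidden)
    return softmax([sum(w * hi for w, hi in zip(row, h)) for row in W_out])

def adapt_output_layer(data, W_hidden, W_out, lr=0.5, epochs=200):
    """Fine-tune only W_out with cross-entropy SGD; W_hidden is never updated."""
    W_out = [row[:] for row in W_out]  # work on a copy
    for _ in range(epochs):
        for x, y in data:
            h = hidden(x, W_hidden)
            p = softmax([sum(w * hi for w, hi in zip(row, h)) for row in W_out])
            for k, row in enumerate(W_out):
                grad = p[k] - (1.0 if k == y else 0.0)  # dLoss/dlogit_k
                for j in range(len(row)):
                    row[j] -= lr * grad * h[j]
    return W_out

def mean_loss(data, W_hidden, W_out):
    return -sum(math.log(predict(x, W_hidden, W_out)[y]) for x, y in data) / len(data)

# Hypothetical weights standing in for a layer pretrained on native speech.
W_hidden = [[0.7, 0.5], [-0.2, -0.5], [0.0, -0.2], [0.6, -0.4]]
W_hidden_before = [row[:] for row in W_hidden]

# Output layer for the new condition, initialised to zero (assumption).
W_out = [[0.0] * 4 for _ in range(2)]

# Tiny invented "non-native" adaptation set: (features, class label).
adapt_data = [([1.0, 0.2], 0), ([0.9, -0.1], 0), ([-1.0, 0.1], 1), ([-0.8, -0.3], 1)]

loss_before = mean_loss(adapt_data, W_hidden, W_out)
W_adapted = adapt_output_layer(adapt_data, W_hidden, W_out)
loss_after = mean_loss(adapt_data, W_hidden, W_adapted)
```

Freezing the lower layer keeps the number of trainable parameters small, which is one common way to cope with limited adaptation data. Which layers to freeze or update is a design choice that this sketch does not take from the paper.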

Findings

01

Transfer learning improves non-native speech recognition accuracy.

02

Multi-lingual models outperform mono-lingual adapted systems.

03

Effective adaptation is achieved with limited non-native speech data.

Abstract

This work deals with non-native children's speech and investigates both multi-task and transfer learning approaches for adapting a multi-language Deep Neural Network (DNN) to speakers, specifically children, who are learning a foreign language. In the application scenario, young students learning English and German read sentences in these second languages as well as in their native language. The paper analyzes and discusses techniques for training effective DNN-based acoustic models, starting from children's native speech and adapting with limited non-native audio material. A multi-lingual model is adopted as the baseline, in which a common phonetic lexicon, defined in terms of the units of the International Phonetic Alphabet (IPA), is shared across the three languages at hand (Italian, German and English); DNN adaptation methods based on transfer learning are evaluated…
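The shared IPA-based lexicon described in the abstract can be pictured as a single phone inventory built from the union of per-language pronunciations. The sketch below is only a toy illustration: the three words, their simplified transcriptions, and the names `lexicons`, `shared_phone_set`, and `phone_to_id` are assumptions and do not come from the paper.

```python
# Hypothetical per-language pronunciation lexicons using IPA symbols.
# Transcriptions are simplified and chosen only for illustration.
lexicons = {
    "it": {"casa": ["k", "a", "z", "a"]},
    "en": {"cat": ["k", "æ", "t"]},
    "de": {"Katze": ["k", "a", "t", "s", "ə"]},
}

def shared_phone_set(lexicons):
    """Union of all IPA units used by any language, in a stable order."""
    phones = set()
    for words in lexicons.values():
        for pronunciation in words.values():
            phones.update(pronunciation)
    return sorted(phones)

def phones_by_language(lexicons):
    """IPA units used by each language separately."""
    return {
        lang: {p for pronunciation in words.values() for p in pronunciation}
        for lang, words in lexicons.items()
    }

phone_set = shared_phone_set(lexicons)
# One output-unit index per shared phone, regardless of language.
phone_to_id = {phone: i for i, phone in enumerate(phone_set)}

per_lang = phones_by_language(lexicons)
# Units that every language uses; their training data is pooled across languages.
shared_by_all = set.intersection(*per_lang.values())
```

With one index per IPA unit, a phone such as `k` that occurs in all three toy lexicons maps to the same model output whichever language the utterance is in, which is what lets a single network serve several languages.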
