Experiments of ASR-based mispronunciation detection for children and adult English learners

Nina Hosseini-Kivanani; Roberto Gretter; Marco Matassoni; Giuseppe Daniele Falavigna

arXiv:2104.05980 · cs.CL · April 14, 2021 · 5 cites

Open Access

TL;DR

This paper presents an ASR-based system for detecting mispronunciations in non-native English speakers, focusing on Italian learners, and demonstrates improved accuracy in identifying pronunciation errors using an error model.

Contribution

It introduces a phone-based ASR system with an error model tailored for non-native English pronunciation assessment, validated on Italian adult and child speech corpora.

Findings

01

Error model improves discrimination of correct and incorrect sounds.

02

ASR system accuracy increases with the error model.

03

Effective detection of pronunciation errors in non-native speech.

Abstract

Pronunciation is one of the fundamentals of language learning, and it is considered a primary factor of spoken language when it comes to understanding and being understood by others. The persistently high error rates that mispronunciations cause in speech recognition motivate us to find alternative techniques for handling them. In this study, we develop a mispronunciation assessment system that checks the pronunciation of non-native English speakers, identifies the phonemes most commonly mispronounced by Italian learners of English, and presents an evaluation of the non-native pronunciation observed in phonetically annotated speech corpora. To detect mispronunciations, we used a phone-based ASR system implemented with Kaldi. We used two labeled corpora of non-native English: (i) a corpus of Italian adults containing 5,867 utterances from 46 speakers,…
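The general idea of phone-level mispronunciation detection can be illustrated by comparing the phone sequence a recognizer outputs against the canonical (expected) sequence. The following is a minimal sketch under stated assumptions, not the system described in the paper: it does not run Kaldi, it does not reproduce the paper's error model, and the function name `flag_mispronunciations` and the ARPAbet-style phone labels are hypothetical choices made for illustration. It uses Python's standard `difflib` matcher as a stand-in for a proper phone alignment.

```python
from difflib import SequenceMatcher

def flag_mispronunciations(canonical, recognized):
    """Return (kind, expected_phones, heard_phones) for each mismatch.

    `canonical` is the expected phone sequence and `recognized` is the
    sequence produced by a phone recognizer (both assumed to be lists of
    phone labels). The alignment here is difflib's block matching, used
    only as an illustrative substitute for a real phone aligner.
    """
    matcher = SequenceMatcher(None, canonical, recognized, autojunk=False)
    kinds = {"replace": "substitution", "delete": "deletion", "insert": "insertion"}
    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            errors.append((kinds[tag], canonical[i1:i2], recognized[j1:j2]))
    return errors

# Hypothetical example: the word "this" with /dh/ realized as /d/.
print(flag_mispronunciations(["DH", "IH", "S"], ["D", "IH", "S"]))
# prints [('substitution', ['DH'], ['D'])]
```

In a real pipeline the recognized sequence would come from the decoder's phone-level output, and the comparison would typically also use timing or confidence information, which this sketch ignores.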

Taxonomy

Topics: Speech Recognition and Synthesis · Phonetics and Phonology Research · Speech and Audio Processing