Mispronunciation Detection in Non-native (L2) English with Uncertainty Modeling

Daniel Korzekwa; Jaime Lorenzo-Trueba; Szymon Zaporowski; Shira Calamaro; Thomas Drugman; Bozena Kostek

arXiv:2101.06396·eess.AS·February 10, 2021

TL;DR

This paper introduces an uncertainty-aware model for detecting mispronunciations in non-native English speech, addressing recognition inaccuracies and multiple valid pronunciations to improve detection precision.

Contribution

It presents a novel approach that incorporates uncertainty modeling and multiple pronunciation variants, advancing beyond traditional single-pronunciation recognition methods.

Findings

01 Up to 18% relative increase in detection precision

02 Effective handling of pronunciation variability

03 Fewer false mispronunciation alarms

Abstract

A common approach to the automatic detection of mispronunciation in language learning is to recognize the phonemes produced by a student and compare them to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, b) there is a single correct way for a sentence to be pronounced. These assumptions do not always hold, which can result in a significant number of false mispronunciation alarms. We propose a novel approach to overcome this problem based on two principles: a) taking into account uncertainty in the automatic phoneme recognition step, b) accounting for the fact that there may be multiple valid pronunciations. We evaluate the model on non-native (L2) English speech of German, Italian and Polish speakers, where it is shown to increase the precision of detecting mispronunciations by…
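The two principles in the abstract can be illustrated with a toy word-level sketch. This is not the authors' model. It assumes, purely for illustration, that the phoneme recognizer returns a short list of hypotheses with probabilities and that a set of valid pronunciation variants is available. The function names, the ARPAbet-style phoneme labels, and the probabilities are all hypothetical.

```python
from typing import Iterable, Sequence, Tuple

Phonemes = Tuple[str, ...]


def baseline_is_mispronounced(recognized: Phonemes, canonical: Phonemes) -> bool:
    """Baseline: trust the single best recognition result and compare it
    against one canonical pronunciation."""
    return recognized != canonical


def mispronunciation_probability(
    hypotheses: Sequence[Tuple[Phonemes, float]],
    valid_variants: Iterable[Phonemes],
) -> float:
    """Uncertainty-aware variant (illustrative assumption, not the paper's model):
    return the share of recognizer probability mass assigned to phoneme
    sequences that match none of the valid pronunciations."""
    valid = set(valid_variants)
    total = sum(prob for _, prob in hypotheses)
    if total <= 0:
        raise ValueError("hypothesis probabilities must sum to a positive value")
    invalid_mass = sum(prob for phones, prob in hypotheses if phones not in valid)
    return invalid_mass / total


# Hypothetical example: the word "either" has two accepted pronunciations.
VARIANTS = [("IY", "DH", "ER"), ("AY", "DH", "ER")]

# Hypothetical recognizer output: candidate phoneme sequences with probabilities.
HYPOTHESES = [
    (("AY", "DH", "ER"), 0.5),   # valid variant
    (("AY", "D", "ER"), 0.25),   # not a valid variant
    (("IY", "DH", "ER"), 0.25),  # valid variant
]

top_hypothesis = max(HYPOTHESES, key=lambda item: item[1])[0]
baseline_flag = baseline_is_mispronounced(top_hypothesis, VARIANTS[0])
probability = mispronunciation_probability(HYPOTHESES, VARIANTS)
flagged = probability >= 0.5  # the decision threshold is an arbitrary choice here
```

In this made-up case the baseline raises an alarm because the top hypothesis differs from the single canonical form, while the variant-aware score stays below the threshold. That is the kind of false alarm the abstract describes.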
