Combined Acoustic and Pronunciation Modelling for Non-Native Speech Recognition

Ghazi Bouselmi (INRIA Lorraine - LORIA); Dominique Fohr (INRIA Lorraine - LORIA); Irina Illina (INRIA Lorraine - LORIA)

arXiv:0711.0811 · cs.CL · November 7, 2007 · 4 cites

TL;DR

This paper combines acoustic and pronunciation adaptation techniques for non-native speech recognition, achieving a relative word error reduction of 46% to 71% on foreign-accented English speech.

Contribution

It presents a novel phonetic confusion scheme and demonstrates the effectiveness of combining multiple adaptation methods for non-native speech recognition.

Findings

01

Up to 71% relative word error reduction

02

Pronunciation modelling combined with acoustic adaptation improves accuracy

03

Effective adaptation techniques for foreign-accented English speech
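
The relative word error reduction quoted in the findings can be read as follows. This is the standard definition, written under the assumption that the reduction is measured against the word error rate (WER) of a baseline recognizer; the symbol names are illustrative and not taken from the paper.

```latex
\[
\text{relative WER reduction}
  = \frac{\mathrm{WER}_{\text{baseline}} - \mathrm{WER}_{\text{adapted}}}
         {\mathrm{WER}_{\text{baseline}}}
\]
```

As a purely hypothetical illustration (these numbers are not results from the paper), a baseline WER of 10% that drops to 2.9% after adaptation corresponds to a relative reduction of (10 - 2.9) / 10 = 71%.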

Abstract

In this paper, we present several adaptation methods for non-native speech recognition. We have tested pronunciation modelling, MLLR and MAP non-native pronunciation adaptation, and HMM model retraining on the HIWIRE foreign-accented English speech database. The "phonetic confusion" scheme we have developed associates each spoken phone with several sequences of confused phones. In our experiments, we have used different combinations of acoustic models representing the canonical and the foreign pronunciations: spoken and native models, and models adapted to the non-native accent with MAP and MLLR. The joint use of pronunciation modelling and acoustic adaptation led to further improvements in recognition accuracy. The best combination of the above-mentioned techniques resulted in a relative word error reduction ranging from 46% to 71%.
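
The idea of associating each spoken phone with several sequences of confused phones can be illustrated with a small sketch. It is not the authors' implementation: the rule table `CONFUSION_RULES`, the phone labels, and the function `expand_pronunciation` are all hypothetical, and the sketch expands a dictionary pronunciation into variants, which is only one possible way such confusion rules could be applied.

```python
from itertools import product

# Hypothetical confusion rules (illustrative only, not from the paper):
# each canonical phone maps to alternative phone sequences that a
# non-native speaker might produce instead.
CONFUSION_RULES = {
    "th": [("s",), ("t",)],
    "ih": [("iy",)],
}


def expand_pronunciation(phones, rules):
    """Return every variant obtained by keeping each canonical phone
    or replacing it with one of its confused phone sequences."""
    # For each phone: the canonical phone itself, then its confused sequences.
    options = [[(p,)] + list(rules.get(p, [])) for p in phones]
    variants = []
    for choice in product(*options):
        # Flatten the chosen sequences into a single pronunciation.
        variants.append(tuple(ph for seq in choice for ph in seq))
    return variants


# Canonical pronunciation of a hypothetical lexicon entry.
variants = expand_pronunciation(["th", "ih", "n"], CONFUSION_RULES)
for v in variants:
    print(" ".join(v))
```

The canonical pronunciation is always kept as the first variant, so the expanded set still covers native-like speech alongside the confused alternatives.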

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing