Pronunciation Generation for Foreign Language Words in Intra-Sentential Code-Switching Speech Recognition

Wei Wang; Chao Zhang; Xiaopei Wu

arXiv:2210.14691·cs.SD·October 27, 2022

TL;DR

This paper presents a data-driven approach to intra-sentential code-switching speech recognition: it generates pronunciations for foreign-language words using phonetic decoding and selection methods, and reports a reduction in Mixed Error Rate from 29.15% to 11.14%.

Contribution

It introduces a novel data-driven method for generating foreign language pronunciations to enhance code-switching speech recognition with limited data.

Findings

01. Reduced Mixed Error Rate from 29.15% to 11.14%

02. Effective use of phonetic decoding and selection methods

03. Improved recognition accuracy for foreign words in code-switching speech
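
To put the first finding in perspective, the short sketch below works out the absolute and relative reduction implied by the two reported Mixed Error Rate figures. The helper name `relative_reduction` is illustrative only; the two percentages are the ones quoted above, and nothing else is taken from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the error-rate reduction as a percentage of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# Figures quoted in the findings above (percent Mixed Error Rate).
baseline_mer = 29.15
improved_mer = 11.14

absolute = baseline_mer - improved_mer          # reduction in percentage points
relative = relative_reduction(baseline_mer, improved_mer)

print(f"absolute: {absolute:.2f} points, relative: {relative:.1f}%")
# prints "absolute: 18.01 points, relative: 61.8%"
```

In other words, the reported change is about 18 percentage points, or roughly a 62% relative reduction in Mixed Error Rate.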

Abstract

Code-switching refers to the phenomenon of switching languages within a sentence or discourse. However, limited code-switching data, differing phoneme sets across languages, and high rebuilding costs make it challenging to build a specialized acoustic model for code-switching speech recognition. In this paper, we use limited code-switching data as driving material and explore a shortcut for quickly adding intra-sentential code-switching recognition to a commissioned native-language acoustic model. We propose a data-driven method for building a seed lexicon, which is used to train a grapheme-to-phoneme model that predicts mapped pronunciations for foreign-language words in code-switching sentences. The core of the data-driven approach consists of a phonetic decoding method and several selection methods. And for the problem of imbalanced word-level driving materials, we…
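
The abstract does not spell out the selection methods, so the following is only a minimal sketch of one plausible way to turn phonetic decoding output into a seed lexicon. It assumes each occurrence of a foreign word has already been decoded into a native-language phone sequence, and it keeps the most frequent sequence per word. The majority-vote rule, the `min_count` threshold, the function name `build_seed_lexicon`, and the toy phone symbols are all assumptions for illustration, not the authors' method.

```python
from collections import Counter, defaultdict


def build_seed_lexicon(decoded, min_count=2):
    """Pick one pronunciation per word by majority vote.

    decoded: iterable of (word, phone_sequence) pairs, one per occurrence
             of a foreign word in the phonetically decoded data.
    min_count: hypothetical threshold; words whose best candidate was seen
               fewer times are left out of the seed lexicon.
    """
    votes = defaultdict(Counter)
    for word, phones in decoded:
        votes[word][tuple(phones)] += 1

    lexicon = {}
    for word, counter in votes.items():
        phones, count = counter.most_common(1)[0]
        if count >= min_count:
            lexicon[word] = phones
    return lexicon


# Toy input with made-up phone symbols (not from the paper).
decoded = [
    ("ok", ("ou", "k", "ei")),
    ("ok", ("ou", "k", "ei")),
    ("ok", ("o", "k", "ei")),
    ("app", ("a", "p")),
]

seed = build_seed_lexicon(decoded)
print(seed)  # {'ok': ('ou', 'k', 'ei')}
```

Words that fail the threshold, such as `app` here, are the kind of entries a grapheme-to-phoneme model trained on the seed lexicon would then be asked to predict.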

Taxonomy

Topics: Speech Recognition and Synthesis