XCB: an effective contextual biasing approach to bias cross-lingual phrases in speech recognition
Xucheng Wan, Naijun Zheng, Kai Liu, Huan Zhou

TL;DR
This paper introduces XCB, a novel cross-lingual biasing method that improves code-switching speech recognition by enhancing recognition of secondary language phrases without extra inference costs.
Contribution
The study proposes a Cross-lingual Contextual Biasing (XCB) module that augments pre-trained ASR models for better bilingual phrase recognition in code-switching scenarios.
Findings
Significant improvement in recognizing secondary language phrases.
Effective on in-house and unseen test datasets.
No additional inference overhead.
Abstract
Contextualized ASR models have been shown to improve the recognition accuracy of uncommon phrases when a predefined phrase list is available. However, these models often struggle in bilingual settings, which are prevalent in code-switching speech recognition. In this study, we make a first attempt to address this challenge by introducing a Cross-lingual Contextual Biasing (XCB) module. Specifically, we augment a pre-trained ASR model for the dominant language by integrating an auxiliary language biasing module and a supplementary language-specific loss, aimed at enhancing the recognition of phrases in the secondary language. Experiments on our in-house code-switching dataset validate the efficacy of our approach, demonstrating significant improvements in the recognition of biasing phrases in the secondary language, even without any…
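To make the idea of a "supplementary language-specific loss" concrete, here is a minimal sketch of how an auxiliary loss might be added to a main ASR training objective. The abstract does not give the formulation, so everything here is an assumption made for illustration: the frame-level language classifier, the label convention (0 = dominant language, 1 = secondary language), the function names, and the `weight` value are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def language_loss(frame_logits, frame_labels):
    """Mean cross-entropy of per-frame language predictions.

    Assumption (not from the paper): each frame carries a language label,
    0 for the dominant language and 1 for the secondary language.
    """
    losses = []
    for logits, label in zip(frame_logits, frame_labels):
        probs = softmax(logits)
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)


def combined_loss(asr_loss, lang_loss, weight=0.3):
    """Main ASR loss plus a weighted auxiliary language loss.

    The weight is a hypothetical hyperparameter chosen for illustration.
    """
    return asr_loss + weight * lang_loss


# Two frames: the first leans dominant-language, the second leans secondary.
aux = language_loss([[2.0, 0.0], [0.0, 2.0]], [0, 1])
total = combined_loss(asr_loss=1.5, lang_loss=aux)
```

An auxiliary term of this kind only affects training, which is consistent with the claim that the approach adds no inference overhead, although the paper's actual mechanism may differ from this sketch.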
Taxonomy
Topics: Speech Recognition and Synthesis
