Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition
Wenxuan Wang, Guodong Ma, Yuke Li, Binbin Du

TL;DR
This paper introduces LR-MoE, a computation-efficient multilingual and code-switching speech recognition model that uses language-specific experts and a frame-wise routing mechanism to improve accuracy while maintaining efficiency.
Contribution
The paper proposes LR-MoE, a novel language-routing MoE model that enhances multilingual and code-switching speech recognition with reduced computational complexity.
Findings
LR-MoE significantly improves multilingual and code-switching recognition performance over baseline models.
It keeps computational efficiency comparable to existing methods.
The Mixture of Language Experts learns effective language-specific representations.
Abstract
Multilingual speech recognition for both monolingual and code-switching speech is a challenging task. Recently, based on the Mixture of Experts (MoE), many works have made good progress in multilingual and code-switching ASR, but present huge computational complexity with the increase of supported languages. In this work, we propose a computation-efficient network named Language-Routing Mixture of Experts (LR-MoE) for multilingual and code-switching ASR. LR-MoE extracts language-specific representations through the Mixture of Language Experts (MLE), which is guided to learn by a frame-wise language routing mechanism. The weight-shared frame-level language identification (LID) network is jointly trained as the shared pre-router of each MoE layer. Experiments show that the proposed method significantly improves multilingual and code-switching speech recognition performances over baseline…
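The routing idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes hard top-1 routing (each frame goes to the single expert picked by the LID prediction), plain ReLU feed-forward experts, random untrained weights, and that the shared LID decision is computed once and reused by every MoE layer. All names (`lid_router`, `mle_layer`, `expert_ffn`) and all dimensions are hypothetical.

```python
import numpy as np

def lid_router(frames, w_lid):
    """Frame-level LID (assumed linear classifier): one language index per frame."""
    logits = frames @ w_lid              # shape (T, num_langs)
    return logits.argmax(axis=-1)        # shape (T,)

def expert_ffn(x, w1, w2):
    """A simple ReLU feed-forward expert (assumed form)."""
    return np.maximum(x @ w1, 0.0) @ w2

def mle_layer(frames, lang_ids, experts):
    """Send each frame only to the expert of its predicted language."""
    out = np.zeros_like(frames)
    for lang, (w1, w2) in enumerate(experts):
        mask = lang_ids == lang
        if mask.any():                   # skip experts that receive no frames
            out[mask] = expert_ffn(frames[mask], w1, w2)
    return out

# Hypothetical sizes: 6 frames, 2 languages, 3 MoE layers.
rng = np.random.default_rng(0)
T, d_model, d_hidden, num_langs, num_layers = 6, 8, 16, 2, 3
frames = rng.standard_normal((T, d_model))
w_lid = rng.standard_normal((d_model, num_langs))
layers = [
    [(0.1 * rng.standard_normal((d_model, d_hidden)),
      0.1 * rng.standard_normal((d_hidden, d_model)))
     for _ in range(num_langs)]
    for _ in range(num_layers)
]

# The routing decision is computed once and shared by all layers (assumption).
lang_ids = lid_router(frames, w_lid)
hidden = frames
for experts in layers:
    hidden = hidden + mle_layer(hidden, lang_ids, experts)  # residual connection
```

Because every frame activates exactly one expert per layer, the per-frame compute in this sketch does not grow with the number of languages, which is the efficiency argument the abstract makes against dense MoE combinations.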
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Phonetics and Phonology Research
