SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR
Shuaishuai Ye, Shunfei Chen, Xinhui Hu, and Xinkang Xu

TL;DR
This paper introduces SC-MoE, a unified model for streaming and non-streaming code-switching (CS) automatic speech recognition (ASR). It uses a Switch-Conformer mixture-of-experts (MoE) architecture with language-specific experts and routers to improve recognition accuracy.
Contribution
The paper presents a Switch-Conformer MoE system for unified streaming and non-streaming code-switching speech recognition. Language experts and routers are placed in the MoE layers of both the encoder and the decoder, and the system outperforms the baseline.
Findings
SC-MoE significantly improves code-switching (CS) ASR accuracy over the baseline.
Adding routers to every MoE layer of the encoder and the decoder yields better recognition performance.
Computational efficiency remains comparable to the baseline.
Abstract
In this work, we propose SC-MoE, a Switch-Conformer-based MoE system for unified streaming and non-streaming code-switching (CS) automatic speech recognition (ASR). We design a streaming MoE layer consisting of three language experts, which correspond to Mandarin, English, and blank, respectively. In the encoder of SC-MoE, this layer is equipped with a language identification (LID) network, trained with a Connectionist Temporal Classification (CTC) loss, that serves as the router, giving a real-time streaming CS ASR system. To further exploit the language information embedded in text, we also incorporate MoE layers into the decoder of SC-MoE. In addition, we introduce routers into every MoE layer of the encoder and the decoder, which yields better recognition performance. Experimental results show that SC-MoE significantly improves CS ASR performance over the baseline with comparable computational efficiency.
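The routing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes Switch-style top-1 routing, feed-forward experts, and a single linear layer standing in for the LID network, and the class name `ToyLanguageMoE`, the expert ordering, and all dimensions are hypothetical. The CTC loss used to train the LID router is omitted, so the weights here are random and only the forward pass is shown.

```python
import numpy as np

# Expert order is an assumption made for this sketch.
LANGS = ("mandarin", "english", "blank")


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ToyLanguageMoE:
    """Hypothetical frame-level MoE layer with one expert per language class."""

    def __init__(self, d_model=8, d_hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        n = len(LANGS)
        # Linear stand-in for the LID router (the paper trains its LID network with CTC).
        self.w_router = rng.normal(0.0, 0.1, (d_model, n))
        # One two-layer feed-forward expert per language class.
        self.w_in = rng.normal(0.0, 0.1, (n, d_model, d_hidden))
        self.w_out = rng.normal(0.0, 0.1, (n, d_hidden, d_model))

    def forward(self, frames):
        """frames: array of shape (T, d_model). Returns (output, lid_probs, choice)."""
        lid_probs = softmax(frames @ self.w_router)  # (T, 3) per-frame language posteriors
        choice = lid_probs.argmax(axis=-1)           # (T,) top-1 expert index per frame
        out = np.zeros_like(frames)
        for k in range(len(LANGS)):
            mask = choice == k
            if not mask.any():
                continue
            hidden = np.maximum(frames[mask] @ self.w_in[k], 0.0)  # ReLU
            # Scale the expert output by the router probability (Switch-style gating).
            out[mask] = (hidden @ self.w_out[k]) * lid_probs[mask, k:k + 1]
        return out, lid_probs, choice


moe = ToyLanguageMoE()
x = np.random.default_rng(1).normal(size=(6, 8))  # six dummy acoustic frames
out, lid_probs, choice = moe.forward(x)
print([LANGS[i] for i in choice])
```

In this sketch each frame is routed using only its own features, so processing an utterance chunk by chunk gives the same output as processing it whole, which is the property a streaming MoE layer needs.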
Taxonomy
Topics: Distributed Sensor Networks and Detection Algorithms · Target Tracking and Data Fusion in Sensor Networks · Blind Source Separation Techniques
Methods: Mixture of Experts
