A meta learning scheme for fast accent domain expansion in Mandarin speech recognition
Ziwei Zhu, Changhao Shan, Bihong Zhang, Jian Yu

TL;DR
This paper introduces a meta-learning approach for rapid accent domain expansion in Mandarin speech recognition, significantly improving performance and training efficiency without compromising baseline accuracy.
Contribution
It combines meta-learning with parameter freezing to enhance accent domain adaptation in Mandarin ASR, achieving faster training and better accuracy.
Findings
Outperforms other methods by 3% in accent domain expansion
Improves on the baseline by 37% (relative) on accented speech while Mandarin test-set performance stays unchanged
Reduces training time by approximately 20%
Abstract
Spoken language varies significantly between standard Mandarin and its accents. Although Mandarin automatic speech recognition (ASR) performs well, accented ASR remains a challenging task. In this paper, we introduce meta-learning techniques for fast accent domain expansion in Mandarin speech recognition, which extends coverage to new accents without degrading Mandarin ASR performance. Meta-learning, or learning to learn, captures general relations across multiple domains rather than overfitting to a specific one, so we adopt it for the domain expansion task. This more general learning improves performance on accent domain expansion tasks. We combine meta-learning with freezing of model parameters, which makes recognition performance more stable across cases and speeds up training by about 20%. Our approach significantly outperforms other methods…
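To make the combination of meta-learning and parameter freezing concrete, here is a minimal sketch on a toy regression problem. It is not the authors' implementation: the abstract does not say which meta-learning algorithm is used, so a Reptile-style first-order update stands in for it, and the linear model, the three toy "accent domains", and all hyperparameters are hypothetical.

```python
# Toy sketch: Reptile-style first-order meta-learning with a frozen parameter.
# Everything here (model, domains, hyperparameters) is hypothetical and only
# illustrates the idea; it is not an ASR model.

def predict(params, x):
    w, b = params
    return w * x + b

def mse_grad(params, xs, ys):
    """Gradient of the mean squared error with respect to (w, b)."""
    n = len(xs)
    gw = sum(2 * (predict(params, x) - y) * x for x, y in zip(xs, ys)) / n
    gb = sum(2 * (predict(params, x) - y) for x, y in zip(xs, ys)) / n
    return [gw, gb]

def adapt(params, xs, ys, trainable, lr=0.1, steps=3):
    """Inner loop: a few gradient steps on one domain, skipping frozen params."""
    p = list(params)
    for _ in range(steps):
        g = mse_grad(p, xs, ys)
        p = [pi - lr * gi if t else pi for pi, gi, t in zip(p, g, trainable)]
    return p

def meta_train(params, domains, trainable, meta_lr=0.5, iters=200):
    """Outer loop: move trainable params toward the mean of the adapted params."""
    p = list(params)
    for _ in range(iters):
        adapted = [adapt(p, xs, ys, trainable) for xs, ys in domains]
        for i, t in enumerate(trainable):
            if t:
                mean_i = sum(a[i] for a in adapted) / len(adapted)
                p[i] += meta_lr * (mean_i - p[i])
    return p

XS = [-1.0, -0.5, 0.0, 0.5, 1.0]
# Three toy "accent domains": same intercept as the base model, different slopes.
DOMAINS = [(XS, [a * x + 0.5 for x in XS]) for a in (1.0, 2.0, 3.0)]

init = [0.0, 0.5]          # (w, b); b stands in for the frozen part of a base model
trainable = [True, False]  # freeze b, meta-learn w
meta_params = meta_train(init, DOMAINS, trainable)
```

In this toy setup the frozen intercept keeps its initial value exactly, while the trainable slope settles near 2.0, the mean slope of the three domains, which gives a starting point that adapts to any of them in a few gradient steps. Freezing also removes the frozen parameter from every update, which is the kind of saving the abstract associates with faster training.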
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing
