A meta learning scheme for fast accent domain expansion in Mandarin speech recognition

Ziwei Zhu; Changhao Shan; Bihong Zhang; Jian Yu

arXiv:2307.12262 · cs.SD · July 25, 2023

Open Access

TL;DR

This paper introduces a meta-learning approach for rapid accent domain expansion in Mandarin speech recognition, significantly improving performance and training efficiency without compromising baseline accuracy.

Contribution

It combines meta-learning with parameter freezing to enhance accent domain adaptation in Mandarin ASR, achieving faster training and better accuracy.

Findings

1. Outperforms other methods by 3% in accent domain expansion
2. Improves baseline performance by 37% on Mandarin test set
3. Reduces training time by approximately 20%

Abstract

Spoken language varies considerably between Mandarin and its accented varieties. Although Mandarin automatic speech recognition (ASR) performs well, accented ASR remains a challenging task. In this paper, we introduce meta-learning techniques for fast accent domain expansion in Mandarin speech recognition, which extend coverage to new accents without degrading Mandarin ASR performance. Meta-learning, or learning to learn, captures general relations across multiple domains rather than overfitting to a specific one, so we adopt it for the domain expansion task. This more general form of learning improves performance on accent domain expansion tasks. We combine meta-learning with freezing of model parameters, which makes recognition performance more stable across conditions and speeds up training by about 20%. Our approach significantly outperforms other methods…
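
The idea of combining meta-learning with frozen parameters can be illustrated with a toy sketch. This summary does not specify the paper's meta-learning algorithm, model architecture, or which parameters are frozen, so everything below is an assumption: a Reptile-style first-order meta-update stands in for the meta-learning step, a two-parameter linear model stands in for the ASR network, and each "accent domain" is a synthetic regression task. All function names are hypothetical.

```python
import random


def grads(params, batch):
    """Mean-squared-error gradients for the toy model y = w * x + b."""
    w, b = params
    n = len(batch)
    gw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2 * (w * x + b - y) for x, y in batch) / n
    return [gw, gb]


def inner_adapt(params, batch, frozen, lr=0.05, steps=5):
    """Adapt a copy of the parameters to one domain; frozen entries never move."""
    adapted = list(params)
    for _ in range(steps):
        g = grads(adapted, batch)
        adapted = [p if f else p - lr * gi
                   for p, gi, f in zip(adapted, g, frozen)]
    return adapted


def meta_step(params, task_batches, frozen, meta_lr=0.5):
    """Reptile-style update: move trainable parameters toward the mean of the
    per-domain adapted parameters. Frozen parameters are left untouched."""
    adapted_all = [inner_adapt(params, b, frozen) for b in task_batches]
    new_params = []
    for i, p in enumerate(params):
        if frozen[i]:
            new_params.append(p)
        else:
            mean_adapted = sum(a[i] for a in adapted_all) / len(adapted_all)
            new_params.append(p + meta_lr * (mean_adapted - p))
    return new_params


def make_task(slope, offset, n=20, seed=0):
    """Synthetic stand-in for one accent domain (assumption, not real data)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, slope * x + offset) for x in xs]


def run_demo():
    # Two toy domains share a slope but differ in offset.
    tasks = [make_task(2.0, 0.5, seed=1), make_task(2.0, -0.5, seed=2)]
    params = [2.0, 3.0]     # slope plays the role of a pretrained, frozen part
    frozen = [True, False]  # only the offset is meta-learned
    for _ in range(50):
        params = meta_step(params, tasks, frozen)
    return params


if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the frozen slope stays exactly at its initial value, while the trainable offset moves toward a point between the two domains' offsets, from which a few inner steps adapt it to either domain. Freezing also shrinks the set of parameters that need gradients, which is one plausible reason a freezing scheme could reduce training time.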

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing