Meta Learning for End-to-End Low-Resource Speech Recognition
Jui-Yang Hsu, Yuan-Jui Chen, Hung-yi Lee

TL;DR
This paper introduces MetaASR, a meta-learning approach for low-resource speech recognition that learns an initialization from multiple pretraining languages so the model can adapt quickly to a new language, outperforming multitask pretraining in preliminary experiments.
Contribution
The paper applies model-agnostic meta learning (MAML) to low-resource speech recognition, treating each language as a separate task and demonstrating better adaptation to unseen languages than multitask pretraining.
Findings
MetaASR outperforms state-of-the-art multitask pretraining on all four target languages, across different combinations of pretraining languages.
Meta-learning enables faster adaptation to new languages with limited data.
Because MAML is model-agnostic, the approach opens a research direction of applying meta learning to other speech-related applications.
Abstract
In this paper, we proposed to apply a meta learning approach to low-resource automatic speech recognition (ASR). We formulated ASR for different languages as different tasks, and meta-learned the initialization parameters from many pretraining languages to achieve fast adaptation to an unseen target language, via the recently proposed model-agnostic meta learning (MAML) algorithm. We evaluated the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results showed that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, because MAML is model-agnostic, this paper also opens a new research direction of applying meta learning to more speech-related applications.
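The idea of meta-learning an initialization with MAML can be illustrated with a minimal sketch. This is not the paper's ASR model or training setup: it assumes a toy problem in which each "language" is replaced by a scalar regression task y = a*x, the model is a single weight w, and all derivatives are written out analytically. The function names, learning rates, and task slopes are hypothetical and chosen only for illustration.

```python
import random

def loss_grad_hess(w, xs, a):
    """Squared-error loss of the model y = w*x on inputs xs drawn from the
    task y = a*x, with its first and second derivative w.r.t. w."""
    m = sum(x * x for x in xs) / len(xs)  # mean of x^2 over the batch
    loss = (w - a) ** 2 * m
    grad = 2.0 * (w - a) * m
    hess = 2.0 * m
    return loss, grad, hess

def adapt(w, xs, a, inner_lr):
    """Inner loop: one gradient step on a task's support set."""
    _, g, _ = loss_grad_hess(w, xs, a)
    return w - inner_lr * g

def maml_meta_gradient(w, support, query, a, inner_lr):
    """Gradient of the query loss after adaptation, w.r.t. the initial w.
    The chain rule through the inner step gives the (1 - lr * hess) factor,
    which is the second-order term of MAML."""
    _, _, h_s = loss_grad_hess(w, support, a)
    w_adapted = adapt(w, support, a, inner_lr)
    _, g_q, _ = loss_grad_hess(w_adapted, query, a)
    return g_q * (1.0 - inner_lr * h_s)

def meta_train(task_slopes, steps=200, inner_lr=0.05, meta_lr=0.05, seed=0):
    """Outer loop: average the meta-gradient over all pretraining tasks
    and update the shared initialization w."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for a in task_slopes:
            support = [rng.uniform(-1.0, 1.0) for _ in range(5)]
            query = [rng.uniform(-1.0, 1.0) for _ in range(5)]
            meta_grad += maml_meta_gradient(w, support, query, a, inner_lr)
        w -= meta_lr * meta_grad / len(task_slopes)
    return w

if __name__ == "__main__":
    # Six toy "pretraining tasks", mirroring the six pretraining languages.
    w_init = meta_train([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    print("meta-learned initialization:", w_init)
```

In this toy setting the learned initialization settles near the middle of the task slopes, from which a single inner step moves toward any one task. In a real ASR system the scalar w would be the full set of network parameters, and the derivatives would come from automatic differentiation rather than closed-form expressions.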
