Meta Learning for End-to-End Low-Resource Speech Recognition

Jui-Yang Hsu; Yuan-Jui Chen; Hung-yi Lee

arXiv:1910.12094 · cs.SD · October 29, 2019

TL;DR

This paper introduces MetaASR, a meta-learning approach to low-resource speech recognition that learns an initialization from multiple pretraining languages so that the model adapts quickly to unseen languages, outperforming multitask pretraining on all four target languages.

Contribution

The paper applies model-agnostic meta learning (MAML) to low-resource speech recognition, treating each language as a separate task and demonstrating better adaptation to unseen languages than multitask pretraining.

Findings

01

MetaASR outperforms state-of-the-art multitask pretraining on all target languages.

02

Meta-learning enables faster adaptation to new languages with limited data.

03

The approach opens new research directions for applying meta learning in speech applications.

Abstract

In this paper, we proposed applying a meta learning approach to low-resource automatic speech recognition (ASR). We formulated ASR for different languages as different tasks, and meta-learned the initialization parameters from many pretraining languages to achieve fast adaptation to an unseen target language, via the recently proposed model-agnostic meta learning algorithm (MAML). We evaluated the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results showed that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, because MAML is model-agnostic, this paper also opens a new research direction of applying meta learning to more speech-related applications.
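
The idea of meta-learning an initialization across tasks can be illustrated with a toy sketch. The code below is not the MetaASR implementation. It assumes a linear regression model in place of an ASR network, treats each weight vector as a stand-in for one "language", and uses the first-order approximation of MAML for brevity (whether the paper uses the first-order or full second-order variant is not stated in the abstract). All function names, learning rates, and task definitions are hypothetical.

```python
import numpy as np


def mse_and_grad(theta, x, y):
    """Mean squared error of the linear model x @ theta, and its gradient."""
    err = x @ theta - y
    loss = float(np.mean(err ** 2))
    grad = 2.0 * x.T @ err / len(y)
    return loss, grad


def sample_batch(rng, task_w, n=20):
    """Draw n examples from one toy task (a stand-in for one language)."""
    x = rng.normal(size=(n, task_w.shape[0]))
    return x, x @ task_w


def fomaml_step(theta, tasks, rng, inner_lr=0.05, meta_lr=0.1):
    """One meta-update using the first-order MAML approximation."""
    meta_grad = np.zeros_like(theta)
    for task_w in tasks:
        xs, ys = sample_batch(rng, task_w)  # support set for the inner step
        xq, yq = sample_batch(rng, task_w)  # query set for the outer loss
        _, g = mse_and_grad(theta, xs, ys)
        adapted = theta - inner_lr * g      # inner-loop adaptation
        # First-order approximation: use the query gradient at the adapted
        # parameters directly, ignoring second-derivative terms.
        _, gq = mse_and_grad(adapted, xq, yq)
        meta_grad += gq
    return theta - meta_lr * meta_grad / len(tasks)


def meta_train(tasks, steps=200, seed=0):
    """Meta-learn an initialization from a list of pretraining tasks."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(tasks[0].shape[0])
    for _ in range(steps):
        theta = fomaml_step(theta, tasks, rng)
    return theta


def adapt(theta, x, y, inner_lr=0.05, n_steps=5):
    """Fine-tune on a small amount of target-task data."""
    for _ in range(n_steps):
        _, g = mse_and_grad(theta, x, y)
        theta = theta - inner_lr * g
    return theta


if __name__ == "__main__":
    # Hypothetical pretraining tasks and an unseen target task.
    pretrain_tasks = [
        np.array([2.0, -1.0, 0.5]),
        np.array([2.5, -0.5, 0.0]),
        np.array([1.5, -1.5, 1.0]),
    ]
    target_w = np.array([2.2, -0.8, 0.6])

    theta0 = meta_train(pretrain_tasks)
    rng = np.random.default_rng(1)
    xs, ys = sample_batch(rng, target_w)
    xq, yq = sample_batch(rng, target_w)
    loss_meta, _ = mse_and_grad(adapt(theta0, xs, ys), xq, yq)
    loss_zero, _ = mse_and_grad(adapt(np.zeros(3), xs, ys), xq, yq)
    print("query loss after adaptation, meta-learned init:", loss_meta)
    print("query loss after adaptation, zero init:", loss_zero)
```

In this toy setting the meta-learned initialization ends up near the center of the pretraining tasks, so a few gradient steps on a small target set reach a lower query loss than the same steps from a zero initialization. The comparison against multitask pretraining reported in the paper is not reproduced here.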
