Magic dust for cross-lingual adaptation of monolingual wav2vec-2.0

Sameer Khurana; Antoine Laurent; James Glass

arXiv:2110.03560·cs.CL·May 18, 2022

TL;DR

This paper introduces a simple cross-lingual transfer learning method using Dropout Uncertainty-Driven Self-Training to adapt monolingual wav2vec-2.0 models for low-resource language ASR, achieving performance comparable to large multilingual models.

Contribution

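The paper presents an approach that adapts a monolingual wav2vec-2.0 model for ASR in a resource-scarce target language and then improves it with DUST on unlabeled target-language speech. The adapted model performs similarly to the multilingual XLSR model, which is trained on fifty-three languages.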

Findings

01. Monolingual wav2vec-2.0 models are effective few-shot learners.

02. DUST improves ASR performance with unlabeled data.

03. Adapted models match multilingual XLSR performance.

Abstract

We propose a simple and effective cross-lingual transfer learning method to adapt monolingual wav2vec-2.0 models for Automatic Speech Recognition (ASR) in resource-scarce languages. We show that a monolingual wav2vec-2.0 is a good few-shot ASR learner in several languages. We improve its performance further via several iterations of Dropout Uncertainty-Driven Self-Training (DUST), using a moderate-sized unlabeled speech dataset in the target language. A key finding of this work is that, on the target-language ASR task, the adapted monolingual wav2vec-2.0 performs similarly to the topline multilingual XLSR model, which is trained on fifty-three languages.
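The abstract does not spell out how DUST picks pseudo-labels, so the following is only a minimal sketch of one common reading of dropout-uncertainty filtering, not the authors' implementation. It assumes that an unlabeled utterance is kept when hypotheses decoded with dropout enabled stay close, in length-normalized edit distance, to the hypothesis decoded without dropout. The names `decode`, `num_samples`, and `threshold`, and their default values, are hypothetical placeholders rather than values taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]


def dust_accept(reference: Sequence[str],
                sampled: List[Sequence[str]],
                threshold: float) -> bool:
    """Accept when every dropout hypothesis is within `threshold`
    normalized edit distance of the no-dropout reference (assumed rule)."""
    ref_len = max(len(reference), 1)  # guard against empty references
    return all(edit_distance(reference, hyp) / ref_len <= threshold
               for hyp in sampled)


def select_pseudo_labels(
    utterances: List[str],
    decode: Callable[[str, bool], List[str]],  # hypothetical: (utterance, dropout_on) -> tokens
    num_samples: int = 3,                      # assumed number of dropout passes
    threshold: float = 0.3,                    # assumed acceptance threshold
) -> List[Tuple[str, List[str]]]:
    """Return (utterance, pseudo-label) pairs that pass the filter."""
    selected = []
    for utt in utterances:
        reference = decode(utt, False)  # deterministic pass, dropout off
        sampled = [decode(utt, True) for _ in range(num_samples)]
        if dust_accept(reference, sampled, threshold):
            selected.append((utt, reference))
    return selected
```

In a self-training loop of the kind the abstract describes, the selected pairs would be added to the labeled set, the model fine-tuned again, and the filter re-run with the updated model for the next iteration.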

Taxonomy

Methods: XLSR · Dropout