Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity

Peter Wu; Jiatong Shi; Yifan Zhong; Shinji Watanabe; Alan W Black

arXiv:2111.01326·eess.AS·November 3, 2021

TL;DR

This paper introduces an acoustic language similarity method for identifying effective cross-lingual transfer pairs, with the aim of extending speech processing support to hundreds of low-resource languages. The method is evaluated on language family classification, speech recognition, and speech synthesis.

Contribution

It presents a scalable acoustic language similarity approach that improves cross-lingual transfer for speech tasks in many low-resource languages.

Findings

01. Effective in language family classification

02. Enhances speech recognition accuracy

03. Improves speech synthesis quality

Abstract

Speech processing systems currently do not support the vast majority of languages, in part due to the lack of data in low-resource languages. Cross-lingual transfer offers a compelling way to help bridge this digital divide by incorporating high-resource data into low-resource systems. Current cross-lingual algorithms have shown success in text-based tasks and speech-related tasks over some low-resource languages. However, scaling up speech systems to support hundreds of low-resource languages remains unsolved. To help bridge this gap, we propose a language similarity approach that can efficiently identify acoustic cross-lingual transfer pairs across hundreds of languages. We demonstrate the effectiveness of our approach in language family classification, speech recognition, and speech synthesis tasks.
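
As an illustration of the pair-selection idea described in the abstract, the sketch below ranks candidate source languages for a target language by the cosine similarity of per-language vectors. This is a minimal sketch under stated assumptions, not the authors' method: the abstract does not specify how acoustic similarity is computed, so the use of cosine similarity, the function names (cosine_similarity, rank_transfer_candidates), the placeholder language labels, and the toy vectors are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_transfer_candidates(target, embeddings):
    """Rank every other language by similarity to `target`, most similar first.

    `embeddings` maps a language label to a fixed-length vector. How such
    vectors would be derived from audio is NOT taken from the paper; any
    per-language acoustic representation is assumed here.
    """
    target_vec = embeddings[target]
    scores = [
        (lang, cosine_similarity(target_vec, vec))
        for lang, vec in embeddings.items()
        if lang != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy, made-up vectors with placeholder labels (not real data).
toy_embeddings = {
    "lang_a": [1.0, 0.0, 0.0],
    "lang_b": [0.9, 0.1, 0.0],
    "lang_c": [0.0, 1.0, 0.0],
    "lang_d": [0.5, 0.5, 0.0],
}

ranking = rank_transfer_candidates("lang_a", toy_embeddings)
for lang, score in ranking:
    print(f"{lang}: {score:.3f}")
```

In this toy setup, the top-ranked language would be the first candidate to try as a high-resource transfer source for the target. Because each comparison is a single vector operation, ranking scales to hundreds of languages, which is the kind of efficiency the abstract refers to.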

Code & Models

Repositories

peter-yh-wu/cross-lingual (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Natural Language Processing Techniques