Learning ASR pathways: A sparse multilingual ASR model

Mu Yang; Andros Tjandra; Chunxi Liu; David Zhang; Duc Le; Ozlem Kalinli

arXiv:2209.05735 · eess.AS · October 2, 2023 · 1 cites

Open Access

TL;DR

This paper introduces ASR pathways, a sparse multilingual speech recognition model that learns language-specific sub-networks, improving performance especially for low-resource languages through shared parameters and knowledge transfer.

Contribution

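The paper proposes a novel algorithm for learning language-specific sub-networks ("pathways") within a single sparse multilingual ASR model, so that each language's parameters are learned explicitly while overlapping pathways share parameters across languages.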

Findings

01 Outperforms both dense models and a language-agnostically pruned model

02 Improves recognition for low-resource languages

03 Enables knowledge transfer through parameters shared between overlapping pathways

Abstract

Neural network pruning compresses automatic speech recognition (ASR) models effectively. However, in multilingual ASR, language-agnostic pruning may lead to severe performance drops on some languages because language-agnostic pruning masks may not fit all languages and discard important language-specific parameters. In this work, we present ASR pathways, a sparse multilingual ASR model that activates language-specific sub-networks ("pathways"), such that the parameters for each language are learned explicitly. With the overlapping sub-networks, the shared parameters can also enable knowledge transfer for lower-resource languages via joint multilingual training. We propose a novel algorithm to learn ASR pathways, and evaluate the proposed method on 4 languages with a streaming RNN-T model. Our proposed ASR pathways outperform both dense models and a language-agnostically pruned model,…
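The contrast the abstract draws between one language-agnostic mask and per-language pathways can be illustrated with a toy sketch. This is not the paper's algorithm, which the text above does not spell out; it uses plain magnitude pruning as a stand-in, and every name and number here (`magnitude_mask`, `shared_fraction`, the weights, the per-language importance scores) is hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def magnitude_mask(scores: List[float], sparsity: float) -> List[int]:
    """Return a 0/1 mask that prunes the `sparsity` fraction of entries
    with the smallest absolute score (a stand-in pruning criterion)."""
    n_prune = int(len(scores) * sparsity)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(scores))]


def apply_mask(weights: List[float], mask: List[int]) -> List[float]:
    """Activate a sub-network by zeroing the masked-out weights."""
    return [w * m for w, m in zip(weights, mask)]


def shared_fraction(mask_a: List[int], mask_b: List[int]) -> float:
    """Share of the parameters kept by either pathway that both keep."""
    both = sum(a & b for a, b in zip(mask_a, mask_b))
    either = sum(a | b for a, b in zip(mask_a, mask_b))
    return both / either if either else 0.0


# Hypothetical shared weights of one layer (toy values).
weights = [0.9, -0.1, 0.5, 0.05, -0.7, 0.3]

# Hypothetical per-language importance scores, for example weight
# magnitudes after adapting the model to each language (an assumption).
importance: Dict[str, List[float]] = {
    "en": [0.9, 0.1, 0.6, 0.05, 0.2, 0.4],
    "fr": [0.8, 0.3, 0.1, 0.5, 0.7, 0.05],
}

# Language-agnostic pruning: one mask for every language.
agnostic_mask = magnitude_mask(weights, sparsity=0.5)

# Language-specific pathways: one mask per language over the same weights.
masks = {lang: magnitude_mask(s, sparsity=0.5) for lang, s in importance.items()}

# Weights that the "fr" pathway would use at inference time.
fr_weights = apply_mask(weights, masks["fr"])

# Parameters kept by both pathways are the ones updated by both languages
# during joint training, which is where knowledge transfer could happen.
overlap = shared_fraction(masks["en"], masks["fr"])
```

In this toy setup the agnostic mask drops index 3, which the hypothetical "fr" scores rank as important, while the per-language masks keep it for "fr" only. The overlap value gives a rough sense of how much of the model the two pathways share.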

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing

Methods: Pruning