Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model

Jiamin Xie; Ke Li; Jinxi Guo; Andros Tjandra; Yuan Shangguan; Leda Sari; Chunyang Wu; Junteng Jia; Jay Mahadeokar; Ozlem Kalinli

arXiv:2309.13018 · eess.AS · June 19, 2025

TL;DR

This paper introduces Dynamic ASR Pathways, an adaptive masking method for efficiently pruning multilingual speech recognition models. It outperforms existing pruning methods and reduces the need for language-specific pruning.

Contribution

It proposes a novel adaptive masking approach that dynamically adjusts sub-networks, improving pruning efficiency and performance in multilingual ASR models.

Findings

1. Outperforms existing pruning methods for sparse monolingual models.
2. Jointly discovers and trains better sub-networks in multilingual models.
3. Reduces the need for language-specific pruning processes.

Abstract

Neural network pruning offers an effective method for compressing a multilingual automatic speech recognition (ASR) model with minimal performance loss. However, it entails several rounds of pruning and re-training that must be run for each language. In this work, we propose the use of an adaptive masking approach in two scenarios for pruning a multilingual ASR model efficiently, resulting in either sparse monolingual models or a sparse multilingual model (named Dynamic ASR Pathways). Our approach dynamically adapts the sub-network, avoiding premature decisions about a fixed sub-network structure. We show that our approach outperforms existing pruning methods when targeting sparse monolingual models. Further, we illustrate that Dynamic ASR Pathways jointly discovers and trains better sub-networks (pathways) of a single multilingual model by adapting from different sub-network…
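
The idea of a mask that adapts during training, rather than being fixed once, can be illustrated with a toy example. The sketch below is not the authors' implementation: the abstract does not state the pruning criterion or the update rule, so the magnitude-based criterion, the quadratic toy loss, the straight-through update of masked weights, and the names `magnitude_mask`, `train_adaptive`, and `remask_every` are all assumptions made for illustration.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that prunes the smallest-magnitude weights.

    Assumption: magnitude is the pruning criterion (not confirmed by the abstract).
    """
    k = int(round(sparsity * weights.size))  # number of weights to prune
    if k == 0:
        return np.ones_like(weights)
    # k-th smallest absolute value acts as the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return (np.abs(weights) > threshold).astype(weights.dtype)

def train_adaptive(weights, target, sparsity, steps=100, lr=0.1, remask_every=10):
    """Toy training loop in which the mask is recomputed periodically.

    The forward pass uses only the masked weights, but the update is applied
    to all dense weights (a straight-through assumption), so a pruned weight
    can grow and re-enter the sub-network at the next re-masking step.
    """
    w = weights.copy()
    mask = magnitude_mask(w, sparsity)
    for step in range(steps):
        error = w * mask - target   # residual of the toy loss 0.5 * ||w * mask - target||^2
        w -= lr * error             # dense update: masked weights are updated too
        if (step + 1) % remask_every == 0:
            mask = magnitude_mask(w, sparsity)  # the sub-network may change here
    return w * mask, mask

# Hypothetical toy values: the initially small weights matter most for the target.
init = np.array([0.1, 1.0, 0.2, 0.9])
target = np.array([3.0, 0.1, -2.0, 0.05])
sparse_weights, final_mask = train_adaptive(init, target, sparsity=0.5)
print(magnitude_mask(init, 0.5), final_mask)
```

In this toy setup a fixed mask chosen at initialization would keep the second and fourth weights for the whole run, while the periodically recomputed mask ends up keeping the first and third, which are the ones the target actually needs.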

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Speech and Audio Processing

Methods: L1 Regularization · Adaptive Masking · Pruning