A Parameter-efficient Language Extension Framework for Multilingual ASR
Wei Liu, Jingyong Hou, Dong Yang, Muyong Cao, Tan Lee

TL;DR
This paper introduces PELE, a parameter-efficient framework for extending multilingual speech recognition models to new languages, effectively addressing catastrophic forgetting and demonstrating superior performance with minimal additional parameters.
Contribution
The paper proposes a novel architecture-based framework, PELE, that enables efficient language extension in MASR models using parameter-efficient fine-tuning modules, outperforming traditional methods.
Findings
PELE effectively incorporates new languages with minimal performance loss.
Adapter-based PEFT modules outperform weight-based and input-feature-based methods.
Experiments on 5 languages show superior results over joint learning approaches.
Abstract
Covering all languages with a multilingual speech recognition model (MASR) is very difficult. Performing language extension on top of an existing MASR is a desirable choice. In this study, the MASR continual learning problem is probabilistically decomposed into language identity prediction (LP) and cross-lingual adaptation (XLA) sub-problems. Based on this, we propose an architecture-based framework for language extension, dubbed PELE, that can fundamentally solve catastrophic forgetting. PELE is designed to be parameter-efficient, incrementally incorporating an add-on module to adapt to a new language. Specifically, different parameter-efficient fine-tuning (PEFT) modules and their variants are explored as potential candidates to perform XLA. Experiments are carried out on 5 new languages with a wide range of low-resourced data sizes. The best-performing PEFT candidate can achieve…
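The architecture-based idea in the abstract (a base model that stays fixed, plus one add-on module per new language, selected via language identity prediction) can be illustrated with a toy sketch. This is not the authors' implementation: `frozen_base`, `make_adapter`, and `ExtensibleModel` are hypothetical names, the "adapter" is a simple residual affine map rather than a real PEFT module, the language posterior is passed in by hand instead of being predicted, and hard routing by the most probable language is an assumption. The sketch only shows why adding a module for a new language cannot change outputs for languages already covered.

```python
from typing import Callable, Dict, List

Vector = List[float]


def frozen_base(features: Vector) -> Vector:
    """Hypothetical stand-in for the frozen base model: a fixed transform."""
    return [2.0 * f for f in features]


def make_adapter(scale: float, shift: float) -> Callable[[Vector], Vector]:
    """Hypothetical add-on module: a residual affine tweak of the hidden state."""
    def adapter(hidden: Vector) -> Vector:
        return [h + (scale * h + shift) for h in hidden]
    return adapter


class ExtensibleModel:
    """Toy model: shared frozen base plus one add-on module per language."""

    def __init__(self) -> None:
        self.adapters: Dict[str, Callable[[Vector], Vector]] = {}

    def add_language(self, lang: str, adapter: Callable[[Vector], Vector]) -> None:
        # Extension only adds a new entry; existing modules are never modified.
        if lang in self.adapters:
            raise ValueError(f"language {lang!r} already registered")
        self.adapters[lang] = adapter

    def forward(self, features: Vector, lang_posterior: Dict[str, float]) -> Vector:
        # Assumption: hard routing by the most probable language.
        lang = max(lang_posterior, key=lang_posterior.get)
        hidden = frozen_base(features)
        adapter = self.adapters.get(lang)
        # Languages without an add-on module fall back to the base output.
        return hidden if adapter is None else adapter(hidden)


model = ExtensibleModel()
model.add_language("lang_a", make_adapter(0.5, 0.0))
x = [1.0, -2.0]
posterior_a = {"lang_a": 0.9, "lang_b": 0.1}

before = model.forward(x, posterior_a)
model.add_language("lang_b", make_adapter(-0.25, 1.0))  # extend to a new language
after = model.forward(x, posterior_a)

# Output for the old language is unchanged after the extension.
assert before == after
```

In this toy setup, the base function and previously registered modules are never updated, so forgetting is ruled out by construction; the remaining risk sits in the language identity prediction step, which the sketch does not model.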
Taxonomy
Topics: Speech and dialogue systems · Speech Recognition and Synthesis · Service-Oriented Architecture and Web Services
Methods: Adapter
