Continual Learning with Embedding Layer Surgery and Task-wise Beam Search using Whisper
Chin Yuen Kwok, Jia Qi Yip, Eng Siong Chng

TL;DR
This paper introduces a novel continual learning approach for multilingual ASR that uses embedding layer surgery and task-wise beam search to effectively add new languages while mitigating catastrophic forgetting.
Contribution
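It proposes Embedding Layer Surgery, which adapts the decoder token embeddings separately for each new language, and Task-wise Beam Search, which lets the model self-correct after language identification (LID) errors, to improve multilingual ASR performance in continual learning settings.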
Findings
Reduces Average WER from 14.2% to 11.9% on pre-trained languages.
Maintains performance on unseen languages.
Outperforms Experience Replay in experiments.
Abstract
Current Multilingual ASR models only support a fraction of the world's languages. Continual Learning (CL) aims to tackle this problem by adding new languages to pre-trained models while avoiding the loss of performance on existing languages, also known as Catastrophic Forgetting (CF). However, existing CL methods overlook the adaptation of the token embedding lookup table at the decoder, despite its significant contribution to CF. We propose Embedding Layer Surgery, where a separate copy of the token embeddings is created for each new language, and one of the copies is selected to replace the old languages' embeddings when transcribing the corresponding new language. Unfortunately, this approach means language identification (LID) errors also cause incorrect ASR embedding selection. Our Task-wise Beam Search allows self-correction for such mistakes. By adapting Whisper to 10 hours of data for each of 10 unseen…
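The two ideas in the abstract can be illustrated with a minimal Python sketch. It is a toy reading of the description above, not the authors' implementation. The names `EmbeddingSurgery`, `task_wise_decode`, and `decode_fn` are hypothetical. The scoring rule (LID log-probability plus hypothesis log-probability over the top-k LID candidates) is an assumption, since the truncated abstract does not say how Task-wise Beam Search ranks hypotheses.

```python
import copy


class EmbeddingSurgery:
    """Toy store of per-language copies of a decoder token embedding table.

    Hypothetical helper: the table is a plain list of float vectors,
    standing in for the decoder's token embedding lookup table.
    """

    def __init__(self, base_table):
        self.base = base_table   # embeddings of the pre-trained languages
        self.copies = {}         # one independent copy per new language

    def add_language(self, lang):
        # Deep copy, so fine-tuning the copy never alters the base table.
        self.copies[lang] = copy.deepcopy(self.base)
        return self.copies[lang]

    def select(self, lang):
        # New languages get their own copy; all others use the base table.
        return self.copies.get(lang, self.base)


def task_wise_decode(store, lid_log_probs, decode_fn, top_k=2):
    """Decode once per candidate language and keep the best total score.

    Assumption: `decode_fn(table, lang)` is a hypothetical decoder call
    returning (text, log_prob). Total score = LID log-prob + text log-prob.
    """
    candidates = sorted(lid_log_probs, key=lid_log_probs.get, reverse=True)[:top_k]
    best = None
    for lang in candidates:
        text, log_prob = decode_fn(store.select(lang), lang)
        total = lid_log_probs[lang] + log_prob
        if best is None or total > best[0]:
            best = (total, lang, text)
    return best[1], best[2]
```

In this sketch, a wrong top-1 LID guess can still be overturned: if the second candidate's embeddings yield a much more likely transcript, its total score wins. That is one way to picture the self-correction described above.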
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Educational Technology and Assessment
Methods: Experience Replay
