Towards Rehearsal-Free Multilingual ASR: A LoRA-based Case Study on Whisper
Tianyi Xu, Kaixun Huang, Pengcheng Guo, Yu Zhou, Longtao Huang, Hui Xue, Lei Xie

TL;DR
This paper explores LoRA-based strategies for adapting multilingual speech models like Whisper to new languages without access to the original training data, aiming to limit catastrophic forgetting on the original languages and to reduce training cost.
Contribution
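It proposes using the LoRA parameters from the original model for approximate orthogonal gradient descent on new-language samples, together with a learnable rank coefficient that allocates trainable parameters, so that the model adapts to new languages while preserving performance on the original ones.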
Findings
Better adaptation results with fewer parameters
Effective mitigation of catastrophic forgetting
Improved efficiency in multilingual speech recognition
Abstract
Pre-trained multilingual speech foundation models, like Whisper, have shown impressive performance across different languages. However, adapting these models to new or specific languages is computationally expensive and suffers from catastrophic forgetting. To address these issues, our study investigates strategies to enhance the model on new languages in the absence of the original training data, while also preserving the established performance on the original languages. Specifically, we first compare various LoRA-based methods to assess their vulnerability to forgetting. To mitigate this issue, we propose to leverage the LoRA parameters from the original model for approximate orthogonal gradient descent on the new samples. We also introduce a learnable rank coefficient to allocate trainable parameters for more efficient training. Our experiments with a Chinese Whisper…
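The two ideas named in the abstract can be illustrated with a small NumPy sketch. This is not the paper's implementation: the abstract does not give the exact formulation, so the function names (lora_delta, project_out), the shapes, and the choice to project a gradient onto the orthogonal complement of the original LoRA row space are all assumptions made for illustration.

```python
import numpy as np

def lora_delta(A, B, rank_coeff):
    """Low-rank weight update B @ diag(rank_coeff) @ A.

    A: (r, d_in), B: (d_out, r), rank_coeff: (r,).
    rank_coeff plays the role of a per-rank scale (assumed form);
    a coefficient of zero switches that rank off.
    """
    return B @ (rank_coeff[:, None] * A)

def project_out(grad, basis_rows):
    """Remove from grad the part lying in the row space of basis_rows.

    grad: (r_new, d_in), basis_rows: (r_old, d_in).
    The result is orthogonal to every row of basis_rows, which is the
    basic step behind orthogonal gradient descent (assumed variant).
    """
    # Orthonormal basis Q (d_in, r_old) for the span of the old rows.
    Q, _ = np.linalg.qr(basis_rows.T)
    return grad - (grad @ Q) @ Q.T

# Hypothetical sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
d_in, d_out, r = 16, 8, 4

A_old = rng.standard_normal((r, d_in))    # stands in for the original LoRA A
grad_new = rng.standard_normal((r, d_in)) # stands in for a gradient on new data
grad_proj = project_out(grad_new, A_old)

A_new = rng.standard_normal((r, d_in))
B_new = rng.standard_normal((d_out, r))
coeff = np.array([1.0, 1.0, 0.0, 0.0])    # two of four ranks active
delta = lora_delta(A_new, B_new, coeff)
```

In this sketch, grad_proj has no component along the rows of A_old, so a step along it leaves the directions used by the original adapter untouched, and delta has rank at most two because two coefficients are zero. In a real training loop the coefficients would be learned rather than fixed.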
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
