Mixture of LoRA Experts for Low-Resourced Multi-Accent Automatic Speech Recognition
Raphaël Bagat, Irina Illina, Emmanuel Vincent

TL;DR
This paper introduces MAS-LoRA, a fine-tuning approach that uses a mixture of accent-specific LoRA experts to make multi-accent ASR more robust to non-native speech, especially in low-resource scenarios. It outperforms regular LoRA and full fine-tuning.
Contribution
It proposes what the authors describe as the first mixture of LoRA experts for non-native multi-accent ASR. The method works whether the accent is known or unknown at inference, without fine-tuning the model again.
Findings
Significant WER improvements over regular LoRA and full fine-tuning when the accent is unknown at inference.
Further WER improvements when the accent is known at inference.
Less catastrophic forgetting than the other fine-tuning methods.
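The findings above are reported in Word Error Rate (WER). As a reminder of what that number measures, here is a minimal sketch of the standard definition: word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions; the paper's own scoring pipeline (text normalization, tooling) is not described in this summary.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Lower is better, and WER can exceed 1.0 when the hypothesis contains many inserted words.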
Abstract
We aim to improve the robustness of Automatic Speech Recognition (ASR) systems against non-native speech, particularly in low-resourced multi-accent settings. We introduce Mixture of Accent-Specific LoRAs (MAS-LoRA), a fine-tuning method that leverages a mixture of Low-Rank Adaptation (LoRA) experts, each specialized in a specific accent. This method can be used when the accent is known or unknown at inference time, without the need to fine-tune the model again. Our experiments, conducted using Whisper on the L2-ARCTIC corpus, demonstrate significant improvements in Word Error Rate compared to regular LoRA and full fine-tuning when the accent is unknown. When the accent is known, the results further improve. Furthermore, MAS-LoRA shows less catastrophic forgetting than the other fine-tuning methods. To the best of our knowledge, this is the first use of a mixture of LoRA experts for…
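To make the idea of a mixture of accent-specific LoRA experts concrete, the following is a minimal NumPy sketch of one adapted linear layer, not the authors' implementation. It assumes a frozen base weight plus one low-rank update per accent. When the accent is known, only that accent's expert is applied; when it is unknown, the experts are averaged uniformly. The uniform averaging, the class name `MixtureOfLoRALinear`, and the hyperparameters `rank` and `alpha` are all assumptions made for illustration; the abstract does not say how the experts are combined or which Whisper layers are adapted.

```python
import numpy as np


class MixtureOfLoRALinear:
    """Frozen linear layer plus K low-rank (LoRA) experts, one per accent.

    Hypothetical sketch: the expert combination rule is an assumption.
    """

    def __init__(self, d_in, d_out, num_experts, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen pretrained weight (random here, for illustration only).
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
        # Usual LoRA initialization: A small random, B zero, so each
        # expert starts out as a no-op on top of the base layer.
        self.A = rng.standard_normal((num_experts, rank, d_in)) * 0.01
        self.B = np.zeros((num_experts, d_out, rank))
        self.scale = alpha / rank
        self.num_experts = num_experts

    def expert_weights(self, accent_id=None):
        """One-hot weights if the accent is known, uniform otherwise (assumption)."""
        if accent_id is None:
            return np.full(self.num_experts, 1.0 / self.num_experts)
        weights = np.zeros(self.num_experts)
        weights[accent_id] = 1.0
        return weights

    def forward(self, x, accent_id=None):
        """x has shape (batch, d_in); returns an array of shape (batch, d_out)."""
        g = self.expert_weights(accent_id)
        base = x @ self.W.T
        low = np.einsum("bi,kri->bkr", x, self.A)       # project into each expert's rank space
        delta = np.einsum("bkr,kor->bko", low, self.B)  # project back to d_out per expert
        return base + self.scale * np.einsum("k,bko->bo", g, delta)


layer = MixtureOfLoRALinear(d_in=6, d_out=5, num_experts=3, rank=2)
x = np.random.default_rng(1).standard_normal((4, 6))
y_known = layer.forward(x, accent_id=1)  # accent known at inference
y_unknown = layer.forward(x)             # accent unknown at inference
print(y_known.shape, y_unknown.shape)
```

In this sketch the same trained experts serve both inference modes, which mirrors the abstract's claim that no further fine-tuning is needed when switching between known and unknown accents.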
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing
