Mixture of LoRA Experts for Low-Resourced Multi-Accent Automatic Speech Recognition

Raphaël Bagat; Irina Illina; Emmanuel Vincent

arXiv:2505.20006·cs.CL·October 3, 2025

TL;DR

This paper introduces MAS-LoRA (Mixture of Accent-Specific LoRAs), a fine-tuning approach that combines accent-specific LoRA experts to make multi-accent ASR more robust in low-resource settings. In experiments with Whisper on the L2-ARCTIC corpus, it achieves a lower Word Error Rate than regular LoRA and full fine-tuning.

Contribution

To the best of the authors' knowledge, it proposes the first mixture of LoRA experts for non-native multi-accent ASR. The method can be used whether the accent is known or unknown at inference time, without fine-tuning the model again.

Findings

01

Significant Word Error Rate (WER) improvements over regular LoRA and full fine-tuning when the accent is unknown at inference.

02

Further WER improvements when the accent is known at inference.

03

Less catastrophic forgetting compared to other fine-tuning methods.

Abstract

We aim to improve the robustness of Automatic Speech Recognition (ASR) systems against non-native speech, particularly in low-resourced multi-accent settings. We introduce Mixture of Accent-Specific LoRAs (MAS-LoRA), a fine-tuning method that leverages a mixture of Low-Rank Adaptation (LoRA) experts, each specialized in a specific accent. This method can be used when the accent is known or unknown at inference time, without the need to fine-tune the model again. Our experiments, conducted using Whisper on the L2-ARCTIC corpus, demonstrate significant improvements in Word Error Rate compared to regular LoRA and full fine-tuning when the accent is unknown. When the accent is known, the results further improve. Furthermore, MAS-LoRA shows less catastrophic forgetting than the other fine-tuning methods. To the best of our knowledge, this is the first use of a mixture of LoRA experts for…
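
To make the idea of a mixture of accent-specific LoRA experts concrete, here is a minimal NumPy sketch for a single linear layer. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the experts are combined, so the uniform average used for the unknown-accent case is an assumption, and every name (make_expert, accent_weights, mixture_forward, the accent labels) is hypothetical.

```python
import numpy as np

def make_expert(d_in, d_out, rank, rng):
    """One LoRA expert: low-rank factors A (rank x d_in) and B (d_out x rank)."""
    A = rng.normal(scale=0.1, size=(rank, d_in))
    # LoRA usually initialises B to zero; it is random here only so the
    # demo produces a visible effect without any training.
    B = rng.normal(scale=0.1, size=(d_out, rank))
    return A, B

def accent_weights(accents, known_accent=None):
    """Mixture weights over experts (assumed scheme, not from the paper).

    Known accent: one-hot on that accent's expert.
    Unknown accent: uniform average over all experts.
    """
    n = len(accents)
    if known_accent is None:
        return np.full(n, 1.0 / n)
    w = np.zeros(n)
    w[accents.index(known_accent)] = 1.0
    return w

def mixture_forward(x, W, experts, weights, scale=1.0):
    """Frozen base projection plus a weighted sum of low-rank updates."""
    y = x @ W.T
    for w, (A, B) in zip(weights, experts):
        y = y + w * scale * (x @ A.T) @ B.T
    return y

rng = np.random.default_rng(0)
accents = ["accent_a", "accent_b", "accent_c"]  # placeholder labels
d_in, d_out, rank = 8, 6, 2
W = rng.normal(size=(d_out, d_in))              # frozen pretrained weight
experts = [make_expert(d_in, d_out, rank, rng) for _ in accents]
x = rng.normal(size=(4, d_in))                  # 4 input frames

y_known = mixture_forward(x, W, experts, accent_weights(accents, "accent_b"))
y_unknown = mixture_forward(x, W, experts, accent_weights(accents))
```

In this sketch the base weight W is never modified, so switching between the known-accent and unknown-accent cases only changes the mixture weights, which is consistent with the abstract's point that no further fine-tuning is needed at inference time.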

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing