HDMoLE: Mixture of LoRA Experts with Hierarchical Routing and Dynamic Thresholds for Fine-Tuning LLM-based ASR Models
Bingshen Mu, Kun Wei, Qijie Shao, Yong Xu, Lei Xie

TL;DR
HDMoLE is a parameter-efficient fine-tuning method for LLM-based ASR models that uses hierarchical routing and dynamic thresholds to adapt to multiple accent domains, with minimal loss of general-domain performance and lower computational cost than full fine-tuning.
Contribution
It introduces hierarchical routing and dynamic thresholds on top of a combination of LoRA and MoE for multi-domain ASR fine-tuning, cutting the number of trainable parameters while keeping performance close to that of full fine-tuning.
Findings
Achieves performance similar to full fine-tuning with only 9.6% of the trainable parameters.
Effectively adapts to multi-accent and Mandarin datasets.
Minimal degradation in general domain performance.
Abstract
Recent advancements in integrating Large Language Models (LLMs) with automatic speech recognition (ASR) have achieved remarkable performance in general domains. While supervised fine-tuning (SFT) of all model parameters is often employed to adapt pre-trained LLM-based ASR models to specific domains, it imposes high computational costs and notably reduces their performance in general domains. In this paper, we propose HDMoLE, a novel parameter-efficient multi-domain fine-tuning method for adapting pre-trained LLM-based ASR models to multi-accent domains without catastrophic forgetting. HDMoLE combines low-rank adaptation (LoRA) with mixture of experts (MoE), leverages hierarchical routing and dynamic thresholds, and can be generalized to any linear layer. Hierarchical routing establishes a clear correspondence between LoRA experts and accent domains, improving cross-domain…
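As a rough illustration of the idea in the abstract, the sketch below applies a mixture of LoRA experts to one frozen linear layer. The abstract is cut off before it describes the mechanics, so nearly everything here is an assumption made for illustration: the function names (`threshold_select`, `mole_linear`), the fixed threshold values (the method's title calls the thresholds dynamic), the fallback to the single best expert, and the `mix` rule that averages a domain-level gate with an input-level gate. It is not the authors' implementation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def threshold_select(weights, threshold):
    """Keep experts whose weight is >= threshold and renormalize.
    Assumption: if no expert passes, fall back to the single best one."""
    kept = [w if w >= threshold else 0.0 for w in weights]
    if sum(kept) == 0.0:
        best = max(range(len(weights)), key=weights.__getitem__)
        kept = [1.0 if i == best else 0.0 for i in range(len(weights))]
    total = sum(kept)
    return [k / total for k in kept]

def mole_linear(x, W, experts, global_logits, router, g_thr, l_thr, mix=0.5):
    """Frozen linear layer plus a gated sum of LoRA expert updates.

    W             frozen pretrained weight, shape (d_out, d_in)
    experts       list of (A, B) pairs; A is (r, d_in), B is (d_out, r)
    global_logits per-expert scores from a domain-level signal (hypothetical,
                  e.g. an accent classifier); one score per expert
    router        input-level router weight, shape (num_experts, d_in)
    g_thr, l_thr  fixed thresholds here; learnable in a real setup (assumption)
    """
    g = threshold_select(softmax(global_logits), g_thr)       # domain-level gate
    l = threshold_select(softmax(matvec(router, x)), l_thr)   # input-level gate
    gate = [mix * gi + (1.0 - mix) * li for gi, li in zip(g, l)]

    out = matvec(W, x)  # frozen base path
    for w, (A, B) in zip(gate, experts):
        if w == 0.0:
            continue  # experts below both thresholds are skipped entirely
        delta = matvec(B, matvec(A, x))  # low-rank update B(Ax)
        out = [o + w * d for o, d in zip(out, delta)]
    return out, gate

# Tiny demo: 2-dim layer, two rank-1 experts.
x = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]
experts = [
    ([[1.0, 0.0]], [[1.0], [0.0]]),  # expert 0 adds x[0] to output 0
    ([[0.0, 1.0]], [[0.0], [1.0]]),  # expert 1 adds x[1] to output 1
]
router = [[0.0, 0.0], [0.0, 0.0]]    # uniform input-level scores
y, gate = mole_linear(x, W, experts, [5.0, 0.0], router, g_thr=0.5, l_thr=0.5)
print(gate)  # [0.75, 0.25]
print(y)     # [1.75, 2.5]
```

In the demo, the domain-level gate keeps only expert 0 while the input-level gate keeps both, so the combined gate is 0.75 and 0.25. Unlike top-k routing, the number of active experts is not fixed: it depends on how many weights clear the threshold.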
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Seismology and Earthquake Studies · Neural Networks and Applications
