Exploring the potential and limitations of Model Merging for Multi-Domain Adaptation in ASR

Carlos Carvalho; Francisco Teixeira; Thomas Rolland; Alberto Abad

arXiv:2603.05354 · cs.CL · March 6, 2026

Open Access

TL;DR

This paper investigates model merging as a scalable approach to multi-domain automatic speech recognition (ASR). It benchmarks existing merging algorithms, proposes a new method, BoostedTSV-M, and reports that the approach outperforms full fine-tuning on European Portuguese.

Contribution

The study benchmarks 11 model merging algorithms for multi-domain ASR across 10 European Portuguese domains, introduces BoostedTSV-M to improve merging stability, and shows that the approach outperforms full fine-tuning on European Portuguese.

Findings

01. BoostedTSV-M mitigates rank collapse and enhances stability.

02. Model merging outperforms full fine-tuning in European Portuguese.

03. The approach maintains out-of-distribution generalization.

Abstract

Model merging is a scalable alternative to multi-task training that combines the capabilities of multiple specialised models into a single model. This is particularly attractive for large speech foundation models, which are typically adapted through domain-specific fine-tuning, resulting in multiple customised checkpoints, for which repeating full fine-tuning when new data becomes available is computationally prohibitive. In this work, we study model merging for multi-domain ASR and benchmark 11 merging algorithms for 10 European Portuguese domains, evaluating in-domain accuracy, robustness under distribution shift, as well as English and multilingual performance. We further propose BoostedTSV-M, a new merging algorithm based on TSV-M that mitigates rank collapse via singular-value boosting and improves numerical stability. Overall, our approach outperforms full fine-tuning on European…
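To make the idea of combining specialised checkpoints concrete, here is a minimal sketch of task-arithmetic merging, one of the simplest merging schemes: each fine-tuned model contributes a "task vector" (its weights minus the base weights), and the scaled task vectors are added back onto the base model. This is a generic baseline for illustration only. It is not BoostedTSV-M or TSV-M, and the function names, the `scale` parameter, the parameter key `layer.weight`, and the toy weights are all assumptions rather than details from the paper.

```python
import numpy as np


def task_vector(base, finetuned):
    """Difference between fine-tuned and base weights, per parameter name."""
    return {name: finetuned[name] - base[name] for name in base}


def merge_task_arithmetic(base, finetuned_models, scale=1.0):
    """Add the scaled sum of task vectors to a copy of the base weights.

    `scale` is a hypothetical merging coefficient; in practice it would
    typically be tuned on held-out data.
    """
    merged = {name: weights.copy() for name, weights in base.items()}
    for finetuned in finetuned_models:
        delta = task_vector(base, finetuned)
        for name in merged:
            merged[name] += scale * delta[name]
    return merged


# Toy example: one 4x4 weight matrix and two hypothetical domain checkpoints.
rng = np.random.default_rng(0)
base = {"layer.weight": rng.normal(size=(4, 4))}
domain_a = {"layer.weight": base["layer.weight"] + 0.1}
domain_b = {"layer.weight": base["layer.weight"] - 0.05}

merged = merge_task_arithmetic(base, [domain_a, domain_b], scale=0.5)
```

Because merging operates only on stored weights, a newly fine-tuned domain checkpoint can be folded in by recomputing the sum, without retraining on the earlier domains' data.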

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis · Topic Modeling