Merge and Conquer: Instructing Multilingual Models by Adding Target Language Weights

Eneko Valero; Maria Ribalta i Albado; Oscar Sainz; Naiara Perez; German Rigau

arXiv:2603.28263·cs.CL·March 31, 2026

TL;DR

This paper investigates merging instruction-tuned multilingual models with language-specific base models to improve low-resource language performance efficiently, reducing the need for extensive retraining.

Contribution

The paper offers a systematic exploration of model merging as a lightweight alternative for adapting multilingual models to low-resource languages, and demonstrates its effectiveness across several Iberian languages.

Findings

01. Merging enables instruction following in new languages without additional fine-tuning.

02. The approach supports multilingual capabilities by combining multiple language-specific models.

03. Model merging achieves competitive performance with significantly lower computational costs.

Abstract

Large Language Models (LLMs) remain heavily centered on English, with limited performance in low-resource languages. Existing adaptation approaches, such as continual pre-training, demand significant computational resources. Instructed models additionally require high-quality instruction data. Both resources are often inaccessible to low-resource language communities. Under these constraints, model merging offers a lightweight alternative, but its potential in low-resource contexts has not been systematically explored. In this work, we explore whether it is possible to transfer language knowledge to an instruction-tuned LLM by merging it with a language-specific base model, thereby eliminating the need for language-specific instructions and for repeated fine-tuning whenever stronger instructed variants become available. Through experiments covering four Iberian…
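
The summary above does not state which merge recipe the authors use. As a rough illustration of what "adding target language weights" could look like, the sketch below applies a task-arithmetic style update: the difference between a language-specific base model and the shared base model is added to the instruction-tuned weights. The function name `add_language_delta`, the scaling factor `alpha`, and the toy weight dictionaries are all assumptions made for this example, not details taken from the paper.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flattened values.
# Real checkpoints hold tensors; plain lists keep the sketch dependency-free.
Weights = Dict[str, List[float]]


def add_language_delta(
    instruct: Weights,
    base: Weights,
    language_base: Weights,
    alpha: float = 1.0,
) -> Weights:
    """Return instruct + alpha * (language_base - base), parameter by parameter.

    Assumes all three checkpoints share the same architecture, so every
    parameter name and length matches. `alpha` is a hypothetical scaling knob.
    """
    merged: Weights = {}
    for name, inst_vals in instruct.items():
        base_vals = base[name]
        lang_vals = language_base[name]
        if not (len(inst_vals) == len(base_vals) == len(lang_vals)):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            i + alpha * (l - b)
            for i, b, l in zip(inst_vals, base_vals, lang_vals)
        ]
    return merged


# Tiny made-up checkpoints, for illustration only.
base = {"w": [1.0, 2.0]}
instruct = {"w": [1.5, 2.0]}        # base after instruction tuning
language_base = {"w": [1.0, 3.0]}   # base after target-language pre-training

merged = add_language_delta(instruct, base, language_base, alpha=0.5)
print(merged)
```

Under the same assumptions, several language-specific models could be combined by applying the function once per language delta. Whether that matches the paper's multilingual setup is not stated in this summary.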
