Training-free LLM Merging for Multi-task Learning

Zichuan Fu; Xian Wu; Yejing Wang; Wanyu Wang; Shanshan Ye; Hongzhi Yin; Yi Chang; Yefeng Zheng; Xiangyu Zhao

arXiv:2506.12379 · cs.CL · June 17, 2025

TL;DR

This paper introduces Hi-Merging, a training-free hierarchical method to unify specialized LLMs into a multi-task capable model, outperforming existing merging techniques without additional training.

Contribution

Proposes Hi-Merging, a novel training-free approach for combining specialized LLMs into a multi-task model using pruning and scaling guided by contribution analysis.

Findings

01 Hi-Merging outperforms existing merging methods.

02 It surpasses fine-tuned models on multiple tasks.

03 It is effective on both Chinese and English NLP tasks.

Abstract

Large Language Models (LLMs) have demonstrated exceptional capabilities across diverse natural language processing (NLP) tasks. The release of open-source LLMs like LLaMA and Qwen has triggered the development of numerous fine-tuned models tailored for various tasks and languages. In this paper, we explore an important question: is it possible to combine these specialized models to create a unified model with multi-task capabilities? We introduce Hierarchical Iterative Merging (Hi-Merging), a training-free method for unifying different specialized LLMs into a single model. Specifically, Hi-Merging employs model-wise and layer-wise pruning and scaling, guided by contribution analysis, to mitigate parameter conflicts. Extensive experiments on multiple-choice and question-answering tasks in both Chinese and English validate Hi-Merging's ability for multi-task learning. The results…
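To make the idea of "pruning and scaling" concrete, the following is a minimal sketch of a generic delta-based merge, not the authors' Hi-Merging algorithm. It assumes each model is a dictionary of NumPy arrays keyed by layer name, and that all fine-tuned models share one base model. The function names (`prune_delta`, `merge_models`) and the parameters `keep_ratio` and `scale` are hypothetical, and the contribution analysis that the abstract says guides these choices is not modeled here; fixed values stand in for it.

```python
import numpy as np


def prune_delta(delta, keep_ratio):
    """Keep roughly the largest-magnitude `keep_ratio` fraction of entries.

    Entries below the magnitude threshold are set to zero. Ties at the
    threshold are all kept, so slightly more than k entries may survive.
    """
    flat = np.abs(delta).ravel()
    k = int(round(keep_ratio * flat.size))
    if k <= 0:
        return np.zeros_like(delta)
    if k >= flat.size:
        return delta.copy()
    # The k-th largest magnitude sits at index (size - k) after partitioning.
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    return np.where(np.abs(delta) >= threshold, delta, 0.0)


def merge_models(base, finetuned_models, keep_ratio=0.5, scale=1.0):
    """Merge fine-tuned models into the base without any training.

    For each layer, the difference between a fine-tuned model and the base
    (the delta) is pruned, scaled, and added back onto the base weights.
    Hypothetical simplification: one keep_ratio and one scale for all
    models and layers.
    """
    merged = {}
    for name, base_weight in base.items():
        total = np.zeros_like(base_weight)
        for model in finetuned_models:
            delta = model[name] - base_weight
            total += scale * prune_delta(delta, keep_ratio)
        merged[name] = base_weight + total
    return merged
```

A model-wise and layer-wise scheme like the one the abstract describes would presumably choose the pruning and scaling values per model and per layer rather than globally; in this sketch that would mean passing a different `keep_ratio` and `scale` for each `(model, name)` pair.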

Code & Models

Repositories

applied-machine-learning-lab/hi-merging
PyTorch · Official

Videos

Training-free LLM Merging for Multi-task Learning · underline

Taxonomy

Topics: Neural Networks and Applications · Fuzzy Logic and Control Systems

Methods: LLaMA · Pruning