Merging Smarter, Generalizing Better: Enhancing Model Merging on OOD Data

Bingjie Zhang; Hongkang Li; Changlong Shi; Guowei Rong; He Zhao; Dongsheng Wang; Dandan Guo; Meng Wang

arXiv:2506.09093 · cs.LG · June 16, 2025

TL;DR

This paper introduces LwPTV, a layer-wise pruning method for model merging that improves out-of-domain generalization in multi-task learning without sacrificing in-domain performance.

Contribution

The paper proposes a novel saliency-based pruning approach, LwPTV, that enhances model merging for better out-of-domain generalization in multi-task learning.

Findings

01 Significant improvements in OOD task performance.

02 Maintains strong in-domain task performance.

03 Compatible with existing model merging methods.

Abstract

Multi-task learning (MTL) concurrently trains a model on diverse task datasets to exploit common features, thereby improving overall performance across the tasks. Recent studies have focused on merging the parameters of multiple independent models into a unified model for MTL, thus circumventing the need for training data and broadening the scenarios in which MTL applies. However, current approaches to model merging predominantly concentrate on performance on in-domain (ID) datasets and often overlook their efficacy on out-of-domain (OOD) datasets. In this work, we propose LwPTV (Layer-wise Pruning Task Vector), which builds a saliency score that measures the redundancy of parameters in task vectors. With this design, our method obtains a mask vector for each task and performs layer-wise pruning on the task vectors, only keeping the pre-trained model parameters at…
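The pipeline the abstract outlines (task vector, per-layer score, per-task mask, pruned merge) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract shown here does not define the saliency score, so the mean absolute magnitude per layer is used as a stand-in, and the function names, the `num_keep` parameter, and the task-arithmetic-style merge with a single `scale` are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical toy representation: layer name -> flattened parameter list.
StateDict = Dict[str, List[float]]


def task_vector(pretrained: StateDict, finetuned: StateDict) -> StateDict:
    """Task vector = fine-tuned parameters minus pre-trained parameters."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def layer_saliency(tv: StateDict) -> Dict[str, float]:
    """ASSUMPTION: mean |delta| per layer stands in for the paper's score."""
    return {name: sum(abs(v) for v in vals) / len(vals) for name, vals in tv.items()}


def layer_mask(saliency: Dict[str, float], num_keep: int) -> Dict[str, int]:
    """Keep the num_keep highest-scoring layers (1); prune the rest (0)."""
    ranked = sorted(saliency, key=saliency.get, reverse=True)
    kept = set(ranked[:num_keep])
    return {name: 1 if name in kept else 0 for name in saliency}


def prune(tv: StateDict, mask: Dict[str, int]) -> StateDict:
    """Zero out whole layers of the task vector where the mask is 0."""
    return {name: [v * mask[name] for v in vals] for name, vals in tv.items()}


def merge(pretrained: StateDict, pruned_tvs: List[StateDict], scale: float) -> StateDict:
    """ASSUMPTION: task-arithmetic-style merge; pruned layers add nothing,
    so the merged model falls back to the pre-trained weights there."""
    return {
        name: [
            b + scale * sum(tv[name][i] for tv in pruned_tvs)
            for i, b in enumerate(base)
        ]
        for name, base in pretrained.items()
    }


# Toy example with two layers and two fine-tuned models (made-up numbers).
pretrained = {"layer0": [0.0, 0.0], "layer1": [1.0, 1.0]}
finetuned_a = {"layer0": [0.5, -0.5], "layer1": [1.01, 0.99]}
finetuned_b = {"layer0": [0.02, 0.0], "layer1": [2.0, 0.0]}

pruned = []
for finetuned in (finetuned_a, finetuned_b):
    tv = task_vector(pretrained, finetuned)
    mask = layer_mask(layer_saliency(tv), num_keep=1)
    pruned.append(prune(tv, mask))

merged = merge(pretrained, pruned, scale=0.5)
print(merged)
```

Because pruning happens on each task vector before merging, a step like this could in principle sit in front of other merging rules instead of the simple sum used here, which is in the spirit of the compatibility finding listed above.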

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications

Methods: Pruning