SE-Merging: A Self-Enhanced Approach for Dynamic Model Merging

Zijun Chen; Zhanpeng Zhou; Bo Zhang; Weinan Zhang; Xi Sun; Junchi Yan

arXiv:2506.18135 · cs.AI · June 24, 2025

TL;DR

SE-Merging is a dynamic model merging framework that identifies the task each sample belongs to and rescales the merging coefficients accordingly, improving multi-task performance without extra training.

Contribution

The paper analyzes model merging from a representation perspective and introduces a self-enhanced merging method that dynamically identifies tasks and adjusts the merging coefficients.

Findings

01. Significant performance improvements over existing merging methods

02. Effective task differentiation and adaptive rescaling in the merging process

03. Compatibility with various existing model merging techniques

Abstract

Model merging has gained increasing attention due to its intriguing property: interpolating the parameters of different task-specific fine-tuned models leads to multi-task abilities. However, despite its empirical success, the underlying mechanisms of model merging remain poorly understood. In this work, we delve into the mechanism behind model merging from a representation perspective. Our analysis reveals that model merging achieves multi-task abilities through two key capabilities: i) distinguishing samples from different tasks, and ii) adapting to the corresponding expert model for each sample. These two capabilities allow the merged model to retain task-specific expertise, enabling efficient multi-task adaptation. Building on these insights, we propose SE-Merging, a self-enhanced model merging framework that leverages these two characteristics to dynamically identify the…
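
The two capabilities named in the abstract (telling tasks apart, then leaning toward the matching expert) can be illustrated with a minimal Python sketch. This is not the paper's implementation. It assumes a task-arithmetic style merge over task vectors, L2 distance between representations, and a softmax over negative distances as the rescaling rule; every function name and all toy numbers below are hypothetical.

```python
import math

def task_vectors(pretrained, experts):
    """Task vector = fine-tuned weights minus pretrained weights."""
    return [[w - w0 for w, w0 in zip(expert, pretrained)] for expert in experts]

def merge(pretrained, vectors, coeffs):
    """Task-arithmetic style merge: theta_0 + sum_i coeff_i * tau_i."""
    merged = list(pretrained)
    for coeff, tau in zip(coeffs, vectors):
        merged = [m + coeff * t for m, t in zip(merged, tau)]
    return merged

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rescaled_coeffs(merged_repr, expert_reprs, base_coeff):
    """Assumed rule: softmax over negative representation distances.

    The expert whose representation is closest to the merged model's
    representation of the sample receives the largest coefficient.
    Coefficients are scaled to sum to base_coeff * num_experts, so the
    total merge strength matches the static merge.
    """
    dists = [l2_distance(merged_repr, r) for r in expert_reprs]
    d_min = min(dists)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - d_min)) for d in dists]
    total = sum(weights)
    n = len(expert_reprs)
    return [base_coeff * n * w / total for w in weights]

# Toy two-parameter "models" (hypothetical numbers).
pretrained = [0.0, 0.0]
experts = [[1.0, 0.0], [0.0, 1.0]]
vectors = task_vectors(pretrained, experts)

# Static merge: the same coefficient for every task and every sample.
static_merged = merge(pretrained, vectors, [0.3, 0.3])

# Hypothetical representations of one sample. In practice these would come
# from forward passes through the merged model and each expert model.
merged_repr = [0.9, 0.1]
expert_reprs = [[1.0, 0.0], [0.0, 1.0]]

# Dynamic merge: per-sample coefficients that favor the closer expert.
coeffs = rescaled_coeffs(merged_repr, expert_reprs, base_coeff=0.3)
dynamic_merged = merge(pretrained, vectors, coeffs)
```

In this toy setup the sample's representation under the merged model sits closer to the first expert, so the first coefficient grows and the second shrinks while their sum stays at the static total. No gradient step is involved, which is consistent with the "without extra training" claim in the TL;DR, although the rescaling rule the authors actually use may differ.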
