Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
Zhenyi Lu, Chenghao Fan, Wei Wei, Xiaoye Qu, Dangyang Chen, Yu Cheng

TL;DR
Twin-Merging introduces a dynamic, modular approach to model merging that separates shared and exclusive knowledge, significantly improving performance and adaptability across diverse language and vision tasks.
Contribution
The paper proposes Twin-Merging, a novel method that modularizes and dynamically merges shared and exclusive knowledge, reducing interference and enhancing efficiency in model merging.
Findings
Achieves a 28.34% average improvement in absolute normalized score on discriminative tasks.
Surpasses fine-tuned upper bounds on generative tasks.
Handles heterogeneous test-time data in model merging.
Abstract
In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models and (b) heterogeneous data during testing. Traditional model merging methods often show significant performance gaps compared to fine-tuned models due to these issues. Additionally, a one-size-fits-all model lacks flexibility for diverse test data, leading to performance degradation. We show that both shared and exclusive task-specific knowledge are crucial for merging performance, but directly merging exclusive knowledge hinders overall performance. In view of this, we propose Twin-Merging, a method that encompasses two principal stages: (1) modularizing knowledge into shared and exclusive components, with compression to reduce redundancy and enhance…
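The split into shared and exclusive knowledge, followed by an input-dependent merge, can be illustrated with a toy sketch. Several choices here are assumptions rather than details stated in the text above: weights are flat lists of floats, shared knowledge is approximated by a plain parameter average, compression is a simple top-k magnitude pruning stand-in, and the router weights are supplied by hand instead of being predicted from the input. The names `modularize`, `top_k_sparsify`, and `dynamic_merge` are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def average(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized weight vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def top_k_sparsify(vec: Vector, k: int) -> Vector:
    """Keep the k largest-magnitude entries and zero the rest.

    Assumption: a stand-in for the compression step, which the
    abstract mentions without specifying.
    """
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:max(k, 0)])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def modularize(
    task_models: Dict[str, Vector], k: int
) -> Tuple[Vector, Dict[str, Vector]]:
    """Stage 1: split task models into one shared vector plus a
    compressed exclusive vector per task."""
    # Assumption: shared knowledge is approximated by a plain average.
    shared = average(list(task_models.values()))
    exclusive = {
        name: top_k_sparsify([w - s for w, s in zip(weights, shared)], k)
        for name, weights in task_models.items()
    }
    return shared, exclusive


def dynamic_merge(
    shared: Vector,
    exclusive: Dict[str, Vector],
    router_weights: Dict[str, float],
) -> Vector:
    """Stage 2: add exclusive vectors back onto the shared vector,
    weighted per input. Here the weights are passed in by hand."""
    merged = list(shared)
    for name, weight in router_weights.items():
        merged = [m + weight * e for m, e in zip(merged, exclusive[name])]
    return merged


# Toy usage with two hypothetical task models of four parameters each.
task_models = {
    "qa": [2.0, 0.0, 2.0, 0.0],
    "summarization": [0.0, 1.0, 2.0, 4.0],
}
shared, exclusive = modularize(task_models, k=2)
merged_for_qa = dynamic_merge(shared, exclusive, {"qa": 1.0, "summarization": 0.0})
```

In this toy setup, an input routed entirely to one task recovers the shared vector plus only that task's compressed difference, so the other task's exclusive parameters never enter the merged weights for that input.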
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Business Process Modeling and Analysis · Multi-Agent Systems and Negotiation
