Fine-Grained Model Merging via Modular Expert Recombination
Haiyun Qiu, Xingyu Wu, Liang Feng, Kay Chen Tan

TL;DR
MERGE is a modular, input-aware model merging technique that improves reusability and efficiency by recombining task-specific components; the authors report that it outperforms existing methods across diverse scenarios.
Contribution
The paper presents MERGE, a novel method for component-wise model merging using bi-objective optimization and a modular expert library for dynamic, input-specific model assembly.
Findings
Outperforms strong baselines across various tasks and models.
Enables efficient, input-specific model recombination at inference.
Achieves a balance between performance and storage efficiency.
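The balance between performance and storage can be read as a two-objective selection problem. The sketch below is only an illustration of that idea, not the paper's optimizer: it assumes the two objectives are an accuracy score (higher is better) and a storage cost (lower is better), and it keeps the candidates that no other candidate beats on both. The `Candidate` type, the function names, and all numbers are hypothetical and are not results from the paper.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical merging configuration with two objective values."""
    name: str
    accuracy: float     # higher is better (made-up value)
    storage_mb: float   # lower is better (made-up value)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.storage_mb <= b.storage_mb
    strictly_better = a.accuracy > b.accuracy or a.storage_mb < b.storage_mb
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that is not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative inputs only; these are not measurements from the paper.
candidates = [
    Candidate("a", 0.90, 400.0),
    Candidate("b", 0.85, 200.0),
    Candidate("c", 0.80, 300.0),  # worse than "b" on both objectives
    Candidate("d", 0.70, 100.0),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

Each surviving candidate represents a different trade-off, so picking one still requires a preference between accuracy and storage.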
Abstract
Model merging constructs versatile models by integrating task-specific models without requiring labeled data or expensive joint retraining. Although recent methods improve adaptability to heterogeneous tasks by generating customized merged models for each instance, they face two critical limitations. First, the instance-specific merged models lack reusability, restricting the exploitation of high-quality merging configurations and efficient batch inference. Second, these methods treat each task-specific model as a monolithic whole, overlooking the diverse mergeability of homologous components such as attention and multilayer perceptron layers, and the differing merging sensitivities across components. To address these limitations, we propose MERGE (Modular Expert Recombination for fine-Grained mErging), a method that enables…
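To make the contrast between monolithic and component-wise merging concrete, here is a minimal sketch. It is not the MERGE algorithm; it uses plain task arithmetic (adding scaled differences between each fine-tuned model and a shared base) and simply lets the scaling coefficient differ per component type. The parameter naming scheme (`.attn.`, `.mlp.`), the function names, and the toy weights are all assumptions made for illustration.

```python
from typing import Dict, List

# Toy stand-in for a state dict: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def component_of(param_name: str) -> str:
    """Map a parameter name to a coarse component label (hypothetical naming scheme)."""
    if ".attn." in param_name:
        return "attention"
    if ".mlp." in param_name:
        return "mlp"
    return "other"


def merge_componentwise(
    base: StateDict,
    experts: Dict[str, StateDict],
    coeffs: Dict[str, Dict[str, float]],
) -> StateDict:
    """Task arithmetic with one coefficient per (task, component) pair.

    merged = base + sum over tasks of coeff[task][component] * (expert - base)
    """
    merged: StateDict = {}
    for name, base_w in base.items():
        comp = component_of(name)
        out = list(base_w)
        for task, expert in experts.items():
            lam = coeffs[task].get(comp, 0.0)  # missing component -> not merged
            for i, (w_e, w_b) in enumerate(zip(expert[name], base_w)):
                out[i] += lam * (w_e - w_b)
        merged[name] = out
    return merged


# Toy weights, chosen only so the arithmetic is easy to follow.
base = {"layer0.attn.q": [1.0, 1.0], "layer0.mlp.fc": [0.0, 0.0]}
experts = {
    "A": {"layer0.attn.q": [2.0, 1.0], "layer0.mlp.fc": [1.0, 0.0]},
    "B": {"layer0.attn.q": [1.0, 3.0], "layer0.mlp.fc": [0.0, 2.0]},
}
coeffs = {
    "A": {"attention": 1.0, "mlp": 0.5},
    "B": {"attention": 0.5, "mlp": 0.0},
}
merged = merge_componentwise(base, experts, coeffs)
print(merged)
```

A monolithic merge corresponds to using the same coefficient for every component of a task; letting attention and MLP coefficients differ is the simplest way to express that components may tolerate merging differently.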
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
