EMR-Merging: Tuning-Free High-Performance Model Merging
Chenyu Huang, Peng Ye, Tao Chen, Tong He, Xiangyu Yue, Wanli Ouyang

TL;DR
EMR-Merging is a tuning-free model merging method that builds a single multi-task model by electing a unified base from the fine-tuned weights and attaching lightweight task-specific modulators (masks and rescalers). It achieves high performance without additional training or data.
Contribution
The paper introduces EMR-Merging, a tuning-free approach that uses lightweight modulators to align an elected unified model with each task-specific model, and it outperforms existing merging methods across various model types and settings.
Findings
Outperforms existing merging methods in multiple settings
Works with vision, NLP, PEFT, and multi-modal models
Requires no additional data or training
Abstract
The success of the pretrain-finetune paradigm has led to the release of numerous model weights. In this context, merging models finetuned on different tasks to obtain a single model with multi-task capabilities is gaining increasing attention for its practicality. Existing model merging methods usually suffer from (1) significant performance degradation or (2) the need for tuning with additional data or training. In this paper, we rethink and analyze the existing model merging paradigm. We discover that a single model's weights can hardly simulate all the models' performance. To tackle this issue, we propose Elect, Mask & Rescale-Merging (EMR-Merging). We first (a) elect a unified model from all the model weights and then (b) generate extremely lightweight task-specific modulators, including masks and rescalers, to align the direction and magnitude between the unified model and each…
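The abstract names the three steps (elect, mask, rescale) but is cut off before the details, so the following is only a minimal sketch under stated assumptions. It assumes each model is represented by a flat "task vector" (finetuned weights minus pretrained weights), that the elected vector takes the dominant sign per parameter and the largest magnitude among entries agreeing with that sign, that a mask marks where a task vector shares the elected sign, and that a rescaler matches total absolute magnitude. These rules, and the names `elect`, `modulators`, and `reconstruct`, are illustrative assumptions and are not taken from this page.

```python
from typing import List, Tuple

Vector = List[float]


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def elect(task_vectors: List[Vector]) -> Vector:
    """Elect one unified task vector from several (assumed rule).

    Per parameter: take the sign of the summed entries, then the largest
    magnitude among the entries that agree with that sign.
    """
    unified = []
    for column in zip(*task_vectors):
        direction = sign(sum(column))
        agreeing = [abs(v) for v in column if direction != 0 and sign(v) == direction]
        unified.append(direction * max(agreeing) if agreeing else 0.0)
    return unified


def modulators(task_vector: Vector, unified: Vector) -> Tuple[List[int], float]:
    """Build the task-specific mask and rescaler (assumed rules).

    The mask keeps positions where the task vector and the unified vector
    share a sign; the rescaler matches the total absolute magnitude of the
    masked unified vector to that of the task vector.
    """
    mask = [1 if t * u > 0 else 0 for t, u in zip(task_vector, unified)]
    masked_mass = sum(abs(u) for m, u in zip(mask, unified) if m)
    task_mass = sum(abs(t) for t in task_vector)
    rescaler = task_mass / masked_mass if masked_mass > 0 else 0.0
    return mask, rescaler


def reconstruct(pretrained: Vector, unified: Vector,
                mask: List[int], rescaler: float) -> Vector:
    """Approximate one task's weights from the shared unified vector."""
    return [p + rescaler * m * u for p, m, u in zip(pretrained, mask, unified)]


if __name__ == "__main__":
    pretrained = [0.0, 0.0, 0.0]          # toy pretrained weights
    tv_a = [1.0, -2.0, 0.5]               # toy task vector for task A
    tv_b = [3.0, 1.0, -0.5]               # toy task vector for task B

    unified = elect([tv_a, tv_b])
    for name, tv in (("A", tv_a), ("B", tv_b)):
        mask, rescaler = modulators(tv, unified)
        print(name, mask, rescaler, reconstruct(pretrained, unified, mask, rescaler))
```

In this sketch only `unified` is shared across tasks. Each task adds one binary mask and one scalar, which is the sense in which the modulators are lightweight, and no step uses data or gradient updates.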
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques
Methods: ALIGN
