EMR-Merging: Tuning-Free High-Performance Model Merging

Chenyu Huang; Peng Ye; Tao Chen; Tong He; Xiangyu Yue; Wanli Ouyang

arXiv:2405.17461·cs.LG·September 30, 2024

TL;DR

EMR-Merging is a tuning-free model merging method. It elects a single unified model from several finetuned models and pairs it with lightweight task-specific modulators (masks and rescalers), achieving high multi-task performance without additional training or data.

Contribution

The paper introduces EMR-Merging, a tuning-free approach that elects a unified model and then aligns it to each task using lightweight modulators. The authors report that it outperforms existing merging methods across various model types and settings.

Findings

01 Outperforms existing merging methods in multiple settings

02 Works with vision, NLP, PEFT, and multi-modal models

03 Requires no additional data or training

Abstract

The success of pretrain-finetune paradigm brings about the release of numerous model weights. In this case, merging models finetuned on different tasks to enable a single model with multi-task capabilities is gaining increasing attention for its practicability. Existing model merging methods usually suffer from (1) significant performance degradation or (2) requiring tuning by additional data or training. In this paper, we rethink and analyze the existing model merging paradigm. We discover that using a single model's weights can hardly simulate all the models' performance. To tackle this issue, we propose Elect, Mask & Rescale-Merging (EMR-Merging). We first (a) elect a unified model from all the model weights and then (b) generate extremely lightweight task-specific modulators, including masks and rescalers, to align the direction and magnitude between the unified model and each…
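The elect, mask, and rescale steps named in the abstract can be illustrated with a small sketch. The abstract is truncated here, so the concrete rules below are assumptions rather than the authors' confirmed procedure: each model is represented as a flat "task vector" (finetuned weights minus pretrained weights), the unified vector takes the majority sign per coordinate and the largest agreeing magnitude, the mask keeps coordinates whose direction matches the task, and the rescaler matches the average magnitude. All function names are hypothetical; the official implementation is in the repository listed under Code & Models.

```python
from typing import List, Tuple

Vector = List[float]


def sign(x: float) -> int:
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def elect(task_vectors: List[Vector]) -> Vector:
    """Elect one unified task vector (assumed rule).

    Per coordinate: take the sign of the summed entries, then the largest
    magnitude among the entries that agree with that sign.
    """
    unified = []
    for column in zip(*task_vectors):
        s = sign(sum(column))
        agreeing = [abs(v) for v in column if s != 0 and sign(v) == s]
        unified.append(s * max(agreeing) if agreeing else 0.0)
    return unified


def modulators(task_vector: Vector, unified: Vector) -> Tuple[List[int], float]:
    """Build the task-specific mask and rescaler (assumed rules).

    Mask: 1 where the unified vector points the same way as the task vector.
    Rescaler: ratio of total magnitudes, so the masked unified vector has
    the same overall scale as the original task vector.
    """
    mask = [1 if t * u > 0 else 0 for t, u in zip(task_vector, unified)]
    kept = sum(abs(u) for m, u in zip(mask, unified) if m)
    total = sum(abs(t) for t in task_vector)
    rescaler = total / kept if kept > 0 else 1.0
    return mask, rescaler


def reconstruct(pretrained: Vector, unified: Vector,
                mask: List[int], rescaler: float) -> Vector:
    """Recover approximate task-specific weights at inference time."""
    return [p + rescaler * m * u for p, u, m in zip(pretrained, unified, mask)]


# Toy example with two tasks and three parameters.
pretrained = [0.0, 0.0, 0.0]
tv_a = [1.0, -2.0, 0.5]
tv_b = [3.0, 1.0, -0.5]

unified = elect([tv_a, tv_b])
mask_a, scale_a = modulators(tv_a, unified)
weights_a = reconstruct(pretrained, unified, mask_a, scale_a)
```

Under these assumptions, only the unified vector is stored at full precision; each task adds one binary mask and one scalar, which is why the modulators can be described as lightweight.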

Code & Models

Repositories

harveyhuang18/emr_merging (PyTorch, official)

Videos

EMR-Merging: Tuning-Free High-Performance Model Merging · SlidesLive

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques

Methods: ALIGN