MAGIC: Achieving Superior Model Merging via Magnitude Calibration
Yayuan Li, Jian Zhang, Jintao Guo, Zihan Cheng, Lei Qi, Yinghuan Shi, Yang Gao

TL;DR
This paper introduces MAGIC, a plug-and-play framework that calibrates feature and weight magnitudes to improve model merging, leading to significant performance gains across vision and NLP tasks without extra training.
Contribution
MAGIC is the first method to focus on magnitude calibration in model merging, addressing a neglected aspect and enhancing performance without additional training.
Findings
Boosts performance by +4.3% on 8 vision datasets
Achieves a +8.0% improvement on the Llama NLP model
Effective across diverse vision and NLP tasks
Abstract
The proliferation of pre-trained models has given rise to a wide array of specialised, fine-tuned models. Model merging aims to merge the distinct capabilities of these specialised models into a unified model, requiring minimal or even no additional training. A core objective of model merging is to ensure the merged model retains the behavioural characteristics of the specialised models, typically achieved through feature alignment. We identify that features consist of two critical components: direction and magnitude. Prior research has predominantly focused on directional alignment, while the influence of magnitude remains largely neglected, despite its pronounced vulnerability to perturbations introduced by common merging operations (e.g., parameter fusion and sparsification). Such perturbations to magnitude inevitably lead to feature deviations in the merged model from the…
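The direction/magnitude split described in the abstract can be illustrated with a small sketch. This is not the MAGIC procedure itself, whose details are not given in this summary. It is a toy example, assuming random stand-in "task vectors" and a hypothetical `rescale_to` helper, that shows how averaging and sparsification shrink the norm while leaving direction as a separate quantity, and how a simple rescaling could restore the magnitude.

```python
import math
import random


def norm(v):
    """Euclidean (L2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def cosine(u, v):
    """Cosine similarity: compares direction only, ignoring magnitude."""
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))


def average(vectors):
    """Element-wise mean, standing in for simple parameter fusion."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def keep_top_fraction(v, fraction):
    """Zero all but the largest-magnitude entries (a basic sparsification)."""
    k = max(1, int(len(v) * fraction))
    threshold = sorted((abs(x) for x in v), reverse=True)[k - 1]
    return [x if abs(x) >= threshold else 0.0 for x in v]


def rescale_to(v, target_norm):
    """Hypothetical helper: keep the direction of v, set its norm to target_norm."""
    scale = target_norm / norm(v)
    return [x * scale for x in v]


random.seed(0)
dim = 512

# Assumption: random vectors stand in for task vectors
# (fine-tuned weights minus pre-trained weights).
task_a = [random.gauss(0.0, 1.0) for _ in range(dim)]
task_b = [random.gauss(0.0, 1.0) for _ in range(dim)]

# Fusion: the mean of two nearly orthogonal vectors has a smaller norm.
merged = average([task_a, task_b])

# Sparsification: dropping entries can only reduce the norm.
sparse_a = keep_top_fraction(task_a, 0.2)

# Illustrative calibration target: the mean norm of the source vectors.
target = (norm(task_a) + norm(task_b)) / 2
calibrated = rescale_to(merged, target)

print(f"norm(task_a)     = {norm(task_a):.2f}")
print(f"norm(task_b)     = {norm(task_b):.2f}")
print(f"norm(merged)     = {norm(merged):.2f}")
print(f"norm(sparse_a)   = {norm(sparse_a):.2f}")
print(f"norm(calibrated) = {norm(calibrated):.2f}")
print(f"cos(merged, calibrated) = {cosine(merged, calibrated):.4f}")
```

Rescaling changes only the magnitude, so the cosine similarity between `merged` and `calibrated` stays at 1. The choice of the mean source norm as the target is an assumption made for this illustration.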
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
