M-Loss: Quantifying Model Merging Compatibility with Limited Unlabeled Data
Tiantong Wang, Yiyang Duan, Haoyu Chen, Tiantong Wu, Wei Yang Bryan Lim

TL;DR
This paper introduces M-Loss, a new metric to evaluate and improve the compatibility of model merging using limited unlabeled data, bridging the gap between merging and ensembling techniques.
Contribution
The paper proposes M-Loss, a novel evaluation metric that quantifies model merging compatibility and guides effective merging strategies with minimal data.
Findings
M-Loss improves the alignment between merged models and ensembling.
Incorporating M-Loss enhances model merging effectiveness.
The approach is scalable and requires limited unlabeled data.
Abstract
Training large-scale models is computationally intensive and often constrained by the availability of labeled data. Model merging offers a compelling alternative by directly integrating the weights of multiple source models without requiring additional data or extensive training. However, conventional model merging techniques, such as parameter averaging, often suffer from the unintended combination of non-generalizable features, especially when source models exhibit significant weight disparities. By comparison, model ensembling, which aggregates multiple models by averaging their outputs, generally provides more stable and superior performance, but it incurs higher inference costs and increased storage requirements. While previous studies experimentally showed the similarities between model merging and ensembling, theoretical evidence and evaluation metrics remain lacking. To…
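The contrast the abstract draws between merging (averaging weights) and ensembling (averaging outputs) can be illustrated with a toy sketch. The code below is not the paper's M-Loss, whose definition is not given in this excerpt. It only measures a simple mean squared gap between a parameter-averaged model and the two-model ensemble on a handful of unlabeled inputs. All names (`relu_net`, `merge`, `ensemble`, `merge_ensemble_gap`) and the tiny one-hidden-layer ReLU network are assumptions made for illustration.

```python
def relu_net(params, x):
    """One-hidden-layer ReLU network: y = w2 . relu(W1 x)."""
    W1, w2 = params
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return sum(w * h for w, h in zip(w2, hidden))


def merge(params_a, params_b):
    """Model merging by plain parameter averaging (one model at inference)."""
    W1a, w2a = params_a
    W1b, w2b = params_b
    W1 = [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(W1a, W1b)]
    w2 = [(a + b) / 2 for a, b in zip(w2a, w2b)]
    return W1, w2


def ensemble(params_a, params_b, x):
    """Model ensembling by output averaging (two forward passes)."""
    return (relu_net(params_a, x) + relu_net(params_b, x)) / 2


def merge_ensemble_gap(params_a, params_b, inputs):
    """Mean squared difference between merged and ensembled outputs.

    Needs only unlabeled inputs; an illustrative discrepancy,
    not the paper's metric.
    """
    merged = merge(params_a, params_b)
    sq = [(relu_net(merged, x) - ensemble(params_a, params_b, x)) ** 2
          for x in inputs]
    return sum(sq) / len(sq)


# Two hand-built source models whose hidden weights point in opposite directions.
model_a = ([[1.0]], [1.0])
model_b = ([[-1.0]], [1.0])
unlabeled = [[-2.0], [-1.0], [1.0], [2.0]]

print(merge_ensemble_gap(model_a, model_a, unlabeled))  # identical models: 0.0
print(merge_ensemble_gap(model_a, model_b, unlabeled))  # opposing weights: 0.625
```

In this toy case the averaged hidden weight is zero, so the merged model outputs 0 everywhere, while the ensemble outputs |x|/2. This is a small instance of the weight-disparity problem the abstract describes: the larger the disagreement between source weights, the further the merged model can drift from the ensemble.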
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
