Demystifying Mergeability: Interpretable Properties to Predict Model Merging Success

Luca Zhou; Bo Zhao; Rose Yu; Emanuele Rodolà

arXiv:2601.22285·cs.LG·May 20, 2026

TL;DR

This paper investigates which factors drive the success of model merging and finds that gradient alignment metrics are the most consistent indicators of mergeability across merging methods and architectures.

Contribution

The paper introduces an architecture-agnostic framework that uses interpretable pairwise metrics to predict and explain model mergeability, and it highlights the importance of gradient alignment.

Findings

01

Gradient alignment metrics are the most fundamental signals of compatibility.

02

Success drivers vary across architectures and merging methods.

03

Certain methods like TIES have distinct mergeability 'fingerprints'.
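
To make the first finding concrete, here is a minimal sketch of two pairwise gradient alignment metrics. The abstract names gradient L_2 distance as one example; cosine similarity is added here as an assumption, as are the function names and the toy gradient vectors. Where and how the paper evaluates its gradients is not stated in this summary, so the inputs are simply taken to be flattened per-task gradient vectors.

```python
import math

def gradient_l2_distance(grad_a, grad_b):
    """Euclidean (L2) distance between two flattened gradient vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(grad_a, grad_b)))

def gradient_cosine_similarity(grad_a, grad_b):
    """Cosine of the angle between two flattened gradient vectors.

    Returns 0.0 when either vector is all zeros (a convention chosen
    for this sketch, not something taken from the paper).
    """
    dot = sum(a * b for a, b in zip(grad_a, grad_b))
    norm_a = math.sqrt(sum(a * a for a in grad_a))
    norm_b = math.sqrt(sum(b * b for b in grad_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical flattened gradients for two fine-tuning tasks.
grad_task_a = [3.0, 0.0, 4.0]
grad_task_b = [0.0, 0.0, 4.0]

l2 = gradient_l2_distance(grad_task_a, grad_task_b)
cos = gradient_cosine_similarity(grad_task_a, grad_task_b)
print(f"L2 distance: {l2:.3f}, cosine similarity: {cos:.3f}")
```

A small distance or a cosine near 1 suggests that the two tasks pull the shared weights in similar directions, which is the intuition behind treating alignment as a compatibility signal.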

Abstract

Model merging combines knowledge from separately fine-tuned models, yet the factors driving its success remain poorly understood. While recent work treats mergeability as an intrinsic property of the models, we show with an architecture-agnostic framework that it fundamentally depends on both the merging method and the partner tasks. Using L1-regularized linear optimization over a set of interpretable pairwise metrics (e.g., gradient L_2 distance), we uncover properties correlating with post-merge normalized accuracy across five merging methods. We find architecture- and method-specific variation in success drivers (64.0% average top-5 metric overlap; 79.3% sign agreement), with certain methods, notably TIES, exhibiting distinct "fingerprints" that diverge from the broader consensus. Crucially, however, gradient alignment metrics consistently emerge as the most fundamental signals of…
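
The analysis the abstract describes can be sketched as follows, under stated assumptions. The block fits an L1-regularized linear model (a plain coordinate-descent Lasso with no intercept, assuming centered features and target) that maps pairwise metrics to post-merge accuracy, then compares two coefficient vectors by top-k overlap and sign agreement. The function names, the toy data, the second coefficient vector, and the exact definitions of overlap and sign agreement are all hypothetical; the paper's own formulation may differ.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso(X, y, lam, n_iters=100):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + lam*||w||_1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho, z = 0.0, 0.0
            for i in range(n):
                # Prediction using every feature except j.
                partial = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w

def top_k_overlap(w_a, w_b, k):
    """Fraction of the k largest-magnitude metrics shared by both fits."""
    def top(w):
        return set(sorted(range(len(w)), key=lambda j: -abs(w[j]))[:k])
    return len(top(w_a) & top(w_b)) / k

def sign_agreement(w_a, w_b):
    """Fraction of metrics, nonzero in both fits, whose signs match."""
    both = [(a, b) for a, b in zip(w_a, w_b) if a != 0.0 and b != 0.0]
    if not both:
        return 0.0
    return sum((a > 0) == (b > 0) for a, b in both) / len(both)

# Toy data: 4 model pairs x 3 centered pairwise metrics (hypothetical).
X = [[1.0, 1.0, 1.0],
     [1.0, -1.0, -1.0],
     [-1.0, 1.0, -1.0],
     [-1.0, -1.0, 1.0]]
# Centered post-merge accuracy, generated as 2*m0 - 1*m1 + 0.05*m2.
y = [1.05, 2.95, -3.05, -0.95]

w_method_a = fit_lasso(X, y, lam=0.1)
# Hypothetical coefficients for a second merging method.
w_method_b = [1.5, 0.4, 0.0]

print("coefficients:", w_method_a)
print("top-2 overlap:", top_k_overlap(w_method_a, w_method_b, k=2))
print("sign agreement:", sign_agreement(w_method_a, w_method_b))
```

The L1 penalty drives the weak third metric to exactly zero, which is what makes the surviving coefficients readable as a per-method "fingerprint". In this toy case the two methods share the same top-2 metrics but disagree on the sign of one of them.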
