Fine, I'll Merge It Myself: A Multi-Fidelity Framework for Automated Model Merging

Guinan Su; Jonas Geiping

arXiv:2502.04030·cs.AI·June 26, 2025

TL;DR

This paper introduces an automated, multi-fidelity framework for model merging that efficiently explores merging strategies to enhance large language models' capabilities without retraining.

Contribution

The paper presents an automated search framework for model merging, with two new search spaces (layerwise fusion, LFS, and depth-wise integration, DIS), that reduces the human effort and computational cost of finding good merges.

Findings

01 Automated search finds merges that improve single-objective performance.

02 The search also finds merges that balance multiple objectives across tasks.

03 Effective merges are found in fewer than 500 search steps.

Abstract

Reasoning capabilities represent a critical frontier for large language models (LLMs), but developing them requires extensive proprietary datasets and computational resources. Model merging offers a promising, efficient way to supplement capabilities by combining multiple models without retraining. However, current merging approaches rely on manually designed strategies for merging hyperparameters, limiting the exploration of potential model combinations and requiring significant human effort. We propose an Automated Model Merging Framework that enables fine-grained exploration of merging strategies while reducing costs through multi-fidelity approximations. We support both single- and multi-objective optimization and introduce two novel search spaces: layerwise fusion (LFS) and depth-wise integration (DIS). Evaluating across a number of benchmarks, we…
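
The idea of searching over per-layer merging coefficients with cheap, low-fidelity evaluations first can be illustrated with a toy sketch. Everything here is an assumption for illustration, not the paper's method or the API of the auto-merge-llm repository: models are plain dicts of float lists, merging is simple linear interpolation per layer, `toy_score` is a made-up noisy objective, and successive halving stands in for whatever multi-fidelity optimizer the authors actually use.

```python
import random


def merge_layerwise(model_a, model_b, weights):
    """Linearly interpolate two toy models layer by layer.

    weights[i] is the share of model_a in the i-th layer, where layers are
    taken in sorted-name order. (Illustrative stand-in for a layerwise
    search space; not the paper's exact operator.)
    """
    merged = {}
    for w, name in zip(weights, sorted(model_a)):
        merged[name] = [
            w * a + (1.0 - w) * b
            for a, b in zip(model_a[name], model_b[name])
        ]
    return merged


def toy_score(model, target, budget, rng):
    """Hypothetical evaluation: negative squared error to a target model,
    plus noise that shrinks as the evaluation budget (fidelity) grows."""
    err = sum(
        (x - t) ** 2
        for name in model
        for x, t in zip(model[name], target[name])
    )
    return -err + rng.gauss(0.0, 1.0 / budget)


def successive_halving(model_a, model_b, target,
                       n_candidates=16, min_budget=1, eta=2, seed=0):
    """Score many candidates cheaply, keep the best 1/eta, and re-score the
    survivors with an eta-times larger budget until one remains."""
    rng = random.Random(seed)
    n_layers = len(model_a)
    candidates = [
        [rng.random() for _ in range(n_layers)]
        for _ in range(n_candidates)
    ]
    budget = min_budget
    while len(candidates) > 1:
        scored = [
            (toy_score(merge_layerwise(model_a, model_b, c),
                       target, budget, rng), c)
            for c in candidates
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        keep = max(1, len(candidates) // eta)
        candidates = [c for _, c in scored[:keep]]
        budget *= eta
    return candidates[0]


if __name__ == "__main__":
    a = {"layer0": [1.0, 2.0], "layer1": [3.0]}
    b = {"layer0": [3.0, 4.0], "layer1": [5.0]}
    target = {"layer0": [2.0, 3.0], "layer1": [5.0]}
    best = successive_halving(a, b, target)
    print("best per-layer weights:", best)
```

In this sketch most candidates are discarded after a single cheap, noisy evaluation, so the expensive high-budget evaluations are spent only on the few configurations that survive; that is the general cost-saving pattern that multi-fidelity approximations rely on.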

Code & Models

Repositories

guinan-su/auto-merge-llm
PyTorch · Official

Taxonomy

Topics: Business Process Modeling and Analysis · Model-Driven Software Engineering Techniques · Service-Oriented Architecture and Web Services