Fine, I'll Merge It Myself: A Multi-Fidelity Framework for Automated Model Merging
Guinan Su, Jonas Geiping

TL;DR
This paper introduces an automated, multi-fidelity framework for model merging that efficiently explores merging strategies to enhance large language models' capabilities without retraining.
Contribution
It presents a novel automated search framework with new merging strategies, reducing human effort and computational costs in model merging.
Findings
Automated search finds merges that improve single-objective performance.
The search also finds merges that improve multi-objective performance across tasks.
Effective merges are found in fewer than 500 search steps.
Abstract
Reasoning capabilities represent a critical frontier for large language models (LLMs), but developing them requires extensive proprietary datasets and computational resources. Model merging offers a promising, efficient alternative for supplementing capabilities: it combines multiple models without retraining. However, current merging approaches rely on manually designed strategies for setting merging hyperparameters, which limits the exploration of potential model combinations and requires significant human effort. We propose an Automated Model Merging Framework that enables fine-grained exploration of merging strategies while reducing costs through multi-fidelity approximations. We support both single- and multi-objective optimization and introduce two novel search spaces: layerwise fusion (LFS) and depth-wise integration (DIS). Evaluating across a number of benchmarks, we…
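The abstract names two ideas that a small example can make concrete: merging models with per-layer hyperparameters, and screening many candidate merges cheaply before spending the full evaluation budget on a few. The Python sketch below is a generic illustration under stated assumptions, not the paper's LFS or DIS search spaces, whose definitions are not given on this page. The function names `merge_layerwise` and `successive_halving` are hypothetical, model parameters are stood in for by plain lists of floats, and successive halving is used here only as one common multi-fidelity scheme.

```python
def merge_layerwise(model_a, model_b, weights):
    """Linearly interpolate two toy models layer by layer.

    model_a, model_b: dict mapping layer name -> list of floats (assumed toy format).
    weights: dict mapping layer name -> weight in [0, 1] given to model_a.
    """
    merged = {}
    for name, params_a in model_a.items():
        w = weights[name]
        params_b = model_b[name]
        # Each layer gets its own interpolation weight (the per-layer hyperparameter).
        merged[name] = [w * a + (1.0 - w) * b for a, b in zip(params_a, params_b)]
    return merged


def successive_halving(candidates, evaluate, budgets):
    """Keep the better half of the candidates at each fidelity level.

    evaluate(candidate, budget) -> score, higher is better. A larger budget
    stands for a more expensive, more faithful evaluation (for example, more
    benchmark samples). Returns the best survivor of the final round.
    """
    survivors = list(candidates)
    for budget in budgets:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = ranked[: max(1, len(ranked) // 2)]
    return survivors[0]


# Toy usage: search over per-layer weights for two two-layer "models".
model_a = {"layer0": [1.0, 3.0], "layer1": [0.0, 2.0]}
model_b = {"layer0": [3.0, 1.0], "layer1": [4.0, 2.0]}
target = {"layer0": [2.0, 2.0], "layer1": [1.0, 2.0]}  # hypothetical ideal merge


def score(weights, budget):
    # Low fidelity: compare only the first `budget` parameters of each layer.
    merged = merge_layerwise(model_a, model_b, weights)
    error = 0.0
    for name, params in merged.items():
        for p, t in zip(params[:budget], target[name][:budget]):
            error += (p - t) ** 2
    return -error


grid = [0.0, 0.25, 0.5, 0.75, 1.0]
candidates = [{"layer0": w0, "layer1": w1} for w0 in grid for w1 in grid]
best = successive_halving(candidates, score, budgets=[1, 2])
```

In this toy setup, the first round scores all 25 weight settings on a single parameter per layer, and only the surviving half is rescored at the higher budget. A real search would replace `score` with benchmark evaluation of the merged model and would typically use a model-based optimizer to propose candidates rather than a fixed grid.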
Taxonomy
Topics: Business Process Modeling and Analysis · Model-Driven Software Engineering Techniques · Service-Oriented Architecture and Web Services
