MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs
Tommaso Mencattini, Adrian Robert Minut, Donato Crisostomi, Andrea Santilli, Emanuele Rodolà

TL;DR
MERGE$^3$ is a framework that makes evolutionary model merging feasible on a single consumer-grade GPU by cutting fitness computation costs 50$\times$ while preserving performance, broadening access to multi-task model development.
Contribution
It introduces a GPU-efficient evolutionary merging method using IRT-based estimators, reducing costs by 50x and enabling high-quality multilingual model merging.
Findings
Reduces fitness computation costs 50 times
Enables effective multilingual and cross-lingual merging
Provides theoretical guarantees and open-source tools
Abstract
Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE$^3$, an efficient framework that makes evolutionary merging feasible on a single GPU by reducing fitness computation costs 50$\times$ while preserving performance. MERGE$^3$ achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.
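The Extract and Estimate steps can be illustrated with a minimal sketch. It is not the authors' estimator or library API: it assumes a standard two-parameter logistic (2PL) IRT model, item parameters `a` (discrimination) and `b` (difficulty) that are already known, a uniformly random subset of 1/50 of the items, and synthetic responses. All function and variable names are hypothetical.

```python
import numpy as np

def p_correct(theta, a, b):
    """2PL IRT model: probability of a correct answer at ability theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def estimate_ability(responses, a, b, grid=None):
    """Grid-search maximum-likelihood estimate of theta from 0/1 responses."""
    if grid is None:
        grid = np.linspace(-4.0, 4.0, 801)
    # Rows index candidate abilities, columns index items.
    p = np.clip(p_correct(grid[:, None], a[None, :], b[None, :]), 1e-9, 1 - 1e-9)
    loglik = (responses * np.log(p) + (1 - responses) * np.log(1 - p)).sum(axis=1)
    return float(grid[np.argmax(loglik)])

def estimate_accuracy(theta, a, b):
    """Predicted accuracy on a set of items: mean probability of correctness."""
    return float(p_correct(theta, a, b).mean())

# Synthetic benchmark (assumption: item parameters were fitted beforehand).
rng = np.random.default_rng(0)
n_items = 2000
a = rng.uniform(0.5, 2.0, n_items)   # item discriminations
b = rng.normal(0.0, 1.0, n_items)    # item difficulties
true_theta = 0.7                     # latent ability of a simulated merged model
full_responses = (rng.random(n_items) < p_correct(true_theta, a, b)).astype(float)

# Extract: evaluate on 1/50 of the items only.
subset = rng.choice(n_items, size=n_items // 50, replace=False)

# Estimate: infer ability from the subset, then predict full-benchmark accuracy.
theta_hat = estimate_ability(full_responses[subset], a[subset], b[subset])
estimated_acc = estimate_accuracy(theta_hat, a, b)
true_acc = float(full_responses.mean())

print(f"theta_hat={theta_hat:.2f}  estimated={estimated_acc:.3f}  true={true_acc:.3f}")
```

In an evolutionary loop, a value like `estimated_acc` would serve as the fitness of each candidate merge, so each candidate is scored on the reduced item set rather than on the full benchmark.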
Taxonomy
Topics: Evolutionary Algorithms and Applications · Recommender Systems and Techniques
