Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation
Thomas Gauthier-Caron, Shamane Siriwardhana, Elliot Stein, Malikeh Ehghaghi, Charles Goddard, Mark McQuade, Jacob Solawetz, Maxime Labonne

TL;DR
This paper introduces Differentiable Adaptive Merging (DAM), a novel, efficient method for model merging that optimizes integration through scaling coefficients, offering a practical alternative to more complex approaches.
Contribution
The paper presents DAM, a differentiable, adaptive merging technique that reduces computational cost and improves model integration compared to existing approaches such as evolutionary strategies and hyperparameter tuning.
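To make the idea of optimizing scaling coefficients concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function name `learn_merge_coefficients`, the one-scalar-per-model granularity, and the squared-error loss against a hypothetical `target` vector are all assumptions chosen for illustration, since this summary does not state DAM's actual objective or coefficient layout. The sketch only shows the general pattern of treating merge coefficients as parameters and updating them by gradient descent.

```python
def learn_merge_coefficients(models, target, lr=0.1, steps=200):
    """Fit one scalar coefficient per model by gradient descent.

    models: list of equal-length weight vectors (toy stand-ins for model parameters).
    target: hypothetical reference vector; the squared-error loss used here is an
            assumption for illustration, not the objective used in the paper.
    """
    # Start from uniform coefficients, i.e. plain averaging.
    coeffs = [1.0 / len(models)] * len(models)
    for _ in range(steps):
        # Merged weights are a coefficient-weighted sum of the source models.
        merged = [
            sum(c * m[i] for c, m in zip(coeffs, models))
            for i in range(len(target))
        ]
        residual = [mi - ti for mi, ti in zip(merged, target)]
        # d/dc_k of sum_i (merged_i - target_i)^2 = 2 * sum_i residual_i * models[k][i]
        grads = [2.0 * sum(r * x for r, x in zip(residual, m)) for m in models]
        coeffs = [c - lr * g for c, g in zip(coeffs, grads)]
    return coeffs


# Toy usage: two "models" with orthogonal weights and a hypothetical target.
coeffs = learn_merge_coefficients([[1.0, 0.0], [0.0, 1.0]], target=[0.3, 0.7])
```

In this toy setup the coefficients move from the uniform starting point toward the values that best reproduce the target, which illustrates why a differentiable formulation can be cheaper than searching over coefficients with an evolutionary strategy.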
Findings
Simple averaging methods perform well with similar models.
DAM outperforms some existing merging techniques in efficiency.
Model merging techniques have varying strengths depending on model similarity.
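The simple averaging mentioned in the findings can be sketched as follows. This is an illustrative, uniform parameter average in the spirit of Model Soups, assuming all models share the same parameter names and shapes; the helper name `average_state_dicts` and the use of plain lists in place of tensors are assumptions made to keep the example self-contained.

```python
def average_state_dicts(state_dicts):
    """Uniformly average parameters that share the same names and shapes."""
    if not state_dicts:
        raise ValueError("need at least one model")
    keys = state_dicts[0].keys()
    for sd in state_dicts:
        if sd.keys() != keys:
            raise ValueError("all models must share the same parameter names")
    n = len(state_dicts)
    merged = {}
    for name in keys:
        # Group the k-th element of this parameter across all models, then average.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [sum(col) / n for col in columns]
    return merged


# Toy usage: two "models" represented as dicts of flat parameter lists.
model_a = {"w": [1.0, 2.0], "b": [0.0]}
model_b = {"w": [3.0, 6.0], "b": [1.0]}
merged = average_state_dicts([model_a, model_b])
```

Uniform averaging has no coefficients to tune, which is what makes it the low-complexity end of the spectrum the paper describes.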
Abstract
By merging models, AI systems can combine the distinct strengths of separate language models, achieving a balance between multiple capabilities without requiring substantial retraining. However, the integration process can be intricate due to differences in training methods and fine-tuning, typically necessitating specialized knowledge and repeated refinement. This paper explores model merging techniques across a spectrum of complexity, examining where automated methods like evolutionary strategies stand compared to hyperparameter-driven approaches such as DARE and TIES-Merging, and simpler methods like Model Soups. In addition, we introduce Differentiable Adaptive Merging (DAM), an efficient, adaptive merging approach as an alternative to evolutionary merging that optimizes model integration through scaling coefficients, minimizing computational demands. Our findings reveal that even…
Taxonomy
Topics: Digital Platforms and Economics
Methods: Model Soups
