MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
Lu Li, Tianyu Zhang, Zhiqi Bu, Suyuchen Wang, Huan He, Jie Fu, Yonghui Wu, Jiang Bian, Yong Chen, Yoshua Bengio

TL;DR
This paper presents MAP, a low-compute algorithm for model merging that efficiently identifies Pareto fronts of trade-offs between multiple tasks, enabling flexible multi-task model optimization.
Contribution
Introduces MAP, a novel quadratic approximation-based method for low-cost Pareto front estimation in model merging, applicable to vision and NLP tasks.
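As a rough illustration of what a quadratic approximation over merging coefficients could look like, the sketch below fits a second-order polynomial surrogate of one task metric from a handful of evaluated coefficient vectors, using ordinary least squares. The function names (quadratic_features, fit_quadratic_surrogate, predict_metric) and the plain least-squares fit are assumptions made for this example and are not taken from the paper.

```python
import numpy as np
from itertools import combinations_with_replacement


def quadratic_features(coeffs):
    """Map each coefficient vector c to [1, c_i, c_i * c_j for i <= j]."""
    C = np.atleast_2d(np.asarray(coeffs, dtype=float))
    n, d = C.shape
    cols = [np.ones(n)]
    cols += [C[:, i] for i in range(d)]
    cols += [C[:, i] * C[:, j]
             for i, j in combinations_with_replacement(range(d), 2)]
    return np.stack(cols, axis=1)


def fit_quadratic_surrogate(coeffs, metric_values):
    """Least-squares fit of a quadratic surrogate (hypothetical helper).

    coeffs:        (n_samples, n_models) scaling coefficients that were evaluated
    metric_values: (n_samples,) measured task metric for each coefficient vector
    """
    X = quadratic_features(coeffs)
    y = np.asarray(metric_values, dtype=float)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def predict_metric(weights, coeffs):
    """Evaluate the fitted surrogate at new coefficient vectors."""
    return quadratic_features(coeffs) @ weights
```

Once such a surrogate is fitted for every task, new coefficient vectors can be scored without running the merged model, which is where the compute saving in a scheme like this would come from.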
Findings
Accurately identifies Pareto fronts in multi-task model merging.
Provides flexible solutions balancing task objectives.
Reduces computational cost with Bayesian and Nested MAP variants.
Abstract
Model merging has emerged as an effective approach to combine multiple single-task models into a multitask model. This process typically involves computing a weighted average of the model parameters without any additional training. Existing model-merging methods focus on enhancing average task accuracy. However, interference and conflicts between the objectives of different tasks can lead to trade-offs during the merging process. In real-world applications, a set of solutions with various trade-offs can be more informative, helping practitioners make decisions based on diverse preferences. In this paper, we introduce a novel and low-compute algorithm, Model Merging with Amortized Pareto Front (MAP). MAP efficiently identifies a Pareto set of scaling coefficients for merging multiple models, reflecting the trade-offs involved. It amortizes the substantial computational cost of…
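The two ideas in the abstract, merging by a weighted combination of parameters and keeping a Pareto set of scaling coefficients, can be sketched in a few lines. This is a minimal illustration under stated assumptions: models are plain dicts of NumPy arrays, coefficients are applied as given without normalization, higher metric values are better, and the helper names (merge_models, pareto_mask) are hypothetical rather than the authors' implementation.

```python
import numpy as np


def merge_models(models, coeffs):
    """Weighted combination of parameters, with no additional training.

    models: list of dicts mapping parameter name -> np.ndarray (same keys/shapes)
    coeffs: one scaling coefficient per model (used as given, not normalized)
    """
    return {
        name: sum(c * m[name] for c, m in zip(coeffs, models))
        for name in models[0]
    }


def pareto_mask(scores):
    """Return a boolean mask of non-dominated rows (higher is better).

    scores: (n_candidates, n_tasks) metric of each candidate merge on each task.
    A row is dominated if another row is at least as good on every task
    and strictly better on at least one.
    """
    S = np.asarray(scores, dtype=float)
    keep = np.ones(len(S), dtype=bool)
    for i in range(len(S)):
        dominates_i = np.all(S >= S[i], axis=1) & np.any(S > S[i], axis=1)
        keep[i] = not dominates_i.any()
    return keep
```

In use, each candidate coefficient vector would be scored on every task (directly or through a cheaper approximation), and pareto_mask would select the candidates that represent distinct trade-offs rather than a single best average.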
Taxonomy
Topics: Simulation Techniques and Applications · Scientific Computing and Data Management · Model-Driven Software Engineering Techniques
Methods: Sparse Evolutionary Training · Focus
