MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation

Lu Li; Tianyu Zhang; Zhiqi Bu; Suyuchen Wang; Huan He; Jie Fu; Yonghui Wu; Jiang Bian; Yong Chen; Yoshua Bengio

arXiv:2406.07529 · cs.LG · April 28, 2025

TL;DR

This paper presents MAP, a low-compute algorithm for model merging that efficiently identifies Pareto fronts of trade-offs between multiple tasks, enabling flexible multi-task model optimization.

Contribution

Introduces MAP, a novel quadratic approximation-based method for low-cost Pareto front estimation in model merging, applicable to vision and NLP tasks.

Findings

01. Accurately identifies Pareto fronts in multi-task model merging.
02. Provides flexible solutions balancing task objectives.
03. Reduces computational cost with Bayesian and Nested MAP variants.

Abstract

Model merging has emerged as an effective approach to combine multiple single-task models into a multitask model. This process typically involves computing a weighted average of the model parameters without any additional training. Existing model-merging methods focus on enhancing average task accuracy. However, interference and conflicts between the objectives of different tasks can lead to trade-offs during the merging process. In real-world applications, a set of solutions with various trade-offs can be more informative, helping practitioners make decisions based on diverse preferences. In this paper, we introduce a novel and low-compute algorithm, Model Merging with Amortized Pareto Front (MAP). MAP efficiently identifies a Pareto set of scaling coefficients for merging multiple models, reflecting the trade-offs involved. It amortizes the substantial computational cost of…
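The two ideas the abstract names, merging by a weighted average of parameters and keeping only the Pareto-optimal scaling coefficients, can be illustrated with a minimal sketch. This is not the MAP algorithm: it assumes each candidate has already been evaluated directly, whereas MAP amortizes that evaluation cost. The function names `merge_parameters` and `pareto_mask`, the dict-of-lists parameter layout, the normalization of coefficients to sum to one, and the toy scores are all assumptions made for illustration.

```python
def merge_parameters(models, coeffs):
    """Weighted average of parameter dicts (name -> list of floats).

    Assumption: coefficients are normalized to sum to 1 before averaging.
    """
    total = sum(coeffs)
    weights = [c / total for c in coeffs]
    merged = {}
    for name in models[0]:
        # Group the i-th entry of this parameter across all models.
        columns = zip(*(m[name] for m in models))
        merged[name] = [
            sum(w * v for w, v in zip(weights, col)) for col in columns
        ]
    return merged


def pareto_mask(scores):
    """Flag candidates that no other candidate dominates.

    scores[i] holds per-task metrics for candidate i; higher is better.
    """
    def dominates(a, b):
        # a is at least as good on every task and strictly better on one.
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))

    return [not any(dominates(other, s) for other in scores) for s in scores]


# Toy single-task models with one two-element parameter each (hypothetical).
model_a = {"w": [1.0, 0.0]}
model_b = {"w": [0.0, 1.0]}
merged = merge_parameters([model_a, model_b], coeffs=[1, 3])

# Hypothetical (task 1, task 2) accuracies for four coefficient choices.
candidate_scores = [(0.9, 0.2), (0.6, 0.6), (0.5, 0.5), (0.2, 0.9)]
front = pareto_mask(candidate_scores)
```

Here the third candidate is dropped because the second is better on both tasks; the remaining three represent different trade-offs, which is the kind of solution set the abstract argues is more informative than a single averaged model.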

Code & Models

Repositories

luli-git/MAP (PyTorch, official)

Videos

MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation · SlidesLive

Taxonomy

Topics: Simulation Techniques and Applications · Scientific Computing and Data Management · Model-Driven Software Engineering Techniques

Methods: Sparse Evolutionary Training · Focus