Bayesian Model Merging

Kaiyang Li; Shaobo Han; Qing Su; Shihao Ji

arXiv:2605.12843·cs.LG·May 14, 2026


TL;DR

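Bayesian Model Merging (BMM) combines multiple task-specific expert models into a single model without joint retraining. It pairs a closed-form Bayesian regression, regularized by a strong anchor-model prior, with an outer Bayesian optimization search, and it outperforms existing plug-and-play baselines on vision and language benchmarks.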

Contribution

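Introduces BMM, a bi-level optimization framework for model merging: the inner level exploits the inductive bias of an anchor model through Bayesian regression, and the outer level uses Bayesian optimization for hyperparameter tuning. A data-free variant is included.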

Findings

01

BMM outperforms existing plug-and-play baselines in vision and language tasks.

02

On the ViT-L/14 benchmark, BMM scores 95.1, close to the performance of the eight individual task experts.

03

BMM effectively merges up to 20 vision tasks and 5 language tasks.

Abstract

Model merging aims to combine multiple task-specific expert models into a single model without joint retraining, offering a practical alternative to multi-task learning when data access or computational budget is limited. Existing methods, however, face two key limitations: (1) they overlook the valuable inductive bias of strong anchor models and estimate the merged weights from scratch, and (2) they rely on a shared hyperparameter setting across different modules of the network, lacking a global optimization strategy. This paper introduces Bayesian Model Merging (BMM), a plug-and-play bi-level optimization framework, where the inner level formulates the model merging as an activation-based Bayesian regression under a strong prior induced by an anchor model, yielding an efficient closed-form solution; and the outer level leverages a Bayesian optimization procedure to search…
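The inner level described in the abstract can be illustrated with a generic Bayesian linear regression whose Gaussian prior is centred on an anchor weight matrix. The sketch below is an assumption-laden illustration, not the paper's exact formulation: the function name `merge_layer_map`, the scalar `prior_precision`, the use of a single linear layer, and the choice of the expert average as the anchor are all hypothetical. It minimizes the sum over tasks of ||X_t W - Y_t||^2 plus prior_precision * ||W - W_anchor||^2, which has the closed-form solution W = (sum_t X_t^T X_t + prior_precision * I)^{-1} (sum_t X_t^T Y_t + prior_precision * W_anchor).

```python
import numpy as np


def merge_layer_map(activations, targets, anchor_weight, prior_precision):
    """MAP weights for one linear layer under a Gaussian prior centred on the anchor.

    activations:     list of (n_t, d_in) arrays, layer inputs collected per task
    targets:         list of (n_t, d_out) arrays, each expert's outputs on those inputs
    anchor_weight:   (d_in, d_out) array, the prior mean (hypothetical anchor model)
    prior_precision: positive scalar, strength of the pull toward the anchor
    """
    d_in = anchor_weight.shape[0]
    # Start from the prior terms, then accumulate per-task sufficient statistics.
    gram = prior_precision * np.eye(d_in)
    rhs = prior_precision * anchor_weight
    for x, y in zip(activations, targets):
        gram = gram + x.T @ x
        rhs = rhs + x.T @ y
    # Solve the normal equations instead of forming an explicit inverse.
    return np.linalg.solve(gram, rhs)


# Toy setup (all values synthetic): two "experts" for a 4 -> 3 linear layer.
rng = np.random.default_rng(0)
d_in, d_out = 4, 3
experts = [rng.normal(size=(d_in, d_out)) for _ in range(2)]
acts = [rng.normal(size=(50, d_in)) for _ in range(2)]
targets = [x @ w for x, w in zip(acts, experts)]

# Assumption: use the plain average of the experts as the anchor.
anchor = np.mean(experts, axis=0)

merged = merge_layer_map(acts, targets, anchor, prior_precision=1.0)
# A very strong prior should reproduce the anchor almost exactly.
strong_prior = merge_layer_map(acts, targets, anchor, prior_precision=1e12)
# A single task with a negligible prior should recover that task's expert.
single_task = merge_layer_map([acts[0]], [targets[0]], anchor, prior_precision=1e-9)
```

In this toy version `prior_precision` is one scalar for one layer. The abstract's second limitation, a hyperparameter setting shared across modules, suggests why per-module values of such a parameter might be worth searching over at an outer level, though how BMM parameterizes that search is not shown in the truncated abstract.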
