HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
Yu Zhou, Xingyu Wu, Jibin Wu, Liang Feng, Kay Chen Tan

TL;DR
This paper introduces a flexible, multi-objective model merging framework that models architecture-space merging as a reinforcement learning task, enabling customized, high-performance combined models across diverse tasks.
Contribution
It proposes a novel reinforcement learning-based approach for architecture-space model merging with multi-objective optimization for personalized task preferences.
Findings
Demonstrates effective merging across multiple tasks
Outperforms existing parameter-space merging methods
Provides customizable merging strategies for diverse applications
Abstract
Model merging is a technique that combines multiple large pretrained models into a single model with enhanced performance and broader task adaptability. It has gained popularity in large pretrained model development due to its ability to bypass the need for original training data and further training processes. However, most existing model merging approaches focus solely on exploring the parameter space, merging models with identical architectures. Merging within the architecture space, despite its potential, remains in its early stages due to the vast search space and the challenges of layer compatibility. This paper marks a significant advance toward more flexible and comprehensive model merging techniques by modeling the architecture-space merging process as a reinforcement learning task. We train policy and value networks using offline sampling of weight vectors, which are then…
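To make the abstract's contrast concrete, here is a minimal sketch of the parameter-space merging that it describes as the focus of most existing approaches: a weighted average of corresponding parameters. This is not the HM3 method. The function name `merge_parameter_space`, the `StateDict` alias, and the toy parameter names are all hypothetical, and real models would hold tensors rather than lists of floats. The shape check shows why this style of merging assumes identical architectures.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a model checkpoint: parameter name -> flat values.
StateDict = Dict[str, List[float]]


def merge_parameter_space(models: Sequence[StateDict],
                          weights: Sequence[float]) -> StateDict:
    """Weighted average of parameters from models sharing one architecture."""
    if not models or len(models) != len(weights):
        raise ValueError("need at least one model and one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalise so the weights sum to 1

    reference = models[0]
    for m in models[1:]:
        # Parameter-space merging only makes sense when every model has the
        # same parameter names and shapes, i.e. the same architecture.
        if set(m) != set(reference):
            raise ValueError("models do not share the same parameter names")
        for name in reference:
            if len(m[name]) != len(reference[name]):
                raise ValueError(f"shape mismatch for parameter {name!r}")

    merged: StateDict = {}
    for name, ref_values in reference.items():
        merged[name] = [
            sum(w * m[name][i] for w, m in zip(norm, models))
            for i in range(len(ref_values))
        ]
    return merged


if __name__ == "__main__":
    model_a = {"layer.weight": [1.0, 2.0]}
    model_b = {"layer.weight": [3.0, 4.0]}
    print(merge_parameter_space([model_a, model_b], [0.5, 0.5]))
```

Two models with different layer counts or layer shapes fail the check above. That is the limitation the abstract attributes to parameter-space methods, and the one architecture-space merging sets out to address.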
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Simulation Techniques and Applications · Business Process Modeling and Analysis
