HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models

Yu Zhou; Xingyu Wu; Jibin Wu; Liang Feng; Kay Chen Tan

arXiv:2409.18893·cs.LG·September 30, 2024

TL;DR

This paper introduces a flexible, multi-objective model merging framework that models architecture-space merging as a reinforcement learning task, enabling customized, high-performance combined models across diverse tasks.

Contribution

The paper proposes a reinforcement-learning-based approach to architecture-space model merging, using multi-objective optimization to accommodate user-specific task preferences.

Findings

01. Effective merging across multiple tasks demonstrated
02. Outperforms existing parameter-space merging methods
03. Provides customizable merging strategies for diverse applications

Abstract

Model merging is a technique that combines multiple large pretrained models into a single model with enhanced performance and broader task adaptability. It has gained popularity in large pretrained model development due to its ability to bypass the need for original training data and further training processes. However, most existing model merging approaches focus solely on exploring the parameter space, merging models with identical architectures. Merging within the architecture space, despite its potential, remains in its early stages due to the vast search space and the challenges of layer compatibility. This paper marks a significant advance toward more flexible and comprehensive model merging techniques by modeling the architecture-space merging process as a reinforcement learning task. We train policy and value networks using offline sampling of weight vectors, which are then…
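To make the terminology concrete, here is a minimal sketch of the two ideas the abstract contrasts and hints at. It is not the paper's method. `merge_parameters` shows plain parameter-space merging as a weighted average, which only works when every model exposes the same parameter names and shapes. `sample_preference` and `scalarize` illustrate one common reading of "weight vectors" in multi-objective optimization, namely preference weights over task scores; that reading is an assumption, since the abstract is truncated here. All function names and the toy parameter dictionaries are hypothetical.

```python
import random


def merge_parameters(models, coeffs):
    """Weighted average of parameter dicts that share the same keys and sizes."""
    if len(models) != len(coeffs):
        raise ValueError("need one coefficient per model")
    keys = models[0].keys()
    if any(m.keys() != keys for m in models):
        # Different parameter names means different architectures,
        # which plain parameter-space merging cannot handle.
        raise ValueError("parameter-space merging needs identical architectures")
    total = sum(coeffs)
    merged = {}
    for k in keys:
        # Group the i-th entry of parameter k across all models.
        columns = zip(*(m[k] for m in models))
        merged[k] = [
            sum(c * v for c, v in zip(coeffs, col)) / total for col in columns
        ]
    return merged


def sample_preference(n_objectives, rng):
    """Sample a non-negative weight vector that sums to 1 (assumed preference)."""
    draws = [rng.expovariate(1.0) for _ in range(n_objectives)]
    s = sum(draws)
    return [d / s for d in draws]


def scalarize(scores, preference):
    """Collapse per-task scores into one number via a weighted sum."""
    return sum(w * s for w, s in zip(preference, scores))


# Toy stand-ins for two fine-tuned models with the same architecture.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [2.0]}
merged = merge_parameters([model_a, model_b], [0.5, 0.5])

# A hypothetical preference over two tasks and the resulting scalar score.
preference = sample_preference(2, random.Random(0))
score = scalarize([0.8, 0.6], preference)
```

The key check in `merge_parameters` is the limitation the abstract points to: once the models differ in layer structure, there is no element-wise correspondence to average over, which is why architecture-space merging needs a different search procedure.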

Videos

HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models · SlidesLive

Taxonomy

Topics: Model-Driven Software Engineering Techniques · Simulation Techniques and Applications · Business Process Modeling and Analysis