Model Merging in the Essential Subspace
Longhua Li, Lei Qi, Qi Tian, Xin Geng

TL;DR
This paper introduces ESM, a novel model merging framework that uses PCA to identify essential subspaces, reducing task interference and effectively combining multiple fine-tuned models into a single multi-task model.
Contribution
The paper proposes a new model merging method leveraging PCA-based essential subspaces and polarized scaling to improve multi-task model integration without additional training.
Findings
Achieves state-of-the-art performance in multi-task model merging.
Effectively reduces task interference during model merging.
Preserves core task-specific knowledge in merged models.
Abstract
Model merging aims to integrate multiple task-specific fine-tuned models derived from a shared pre-trained checkpoint into a single multi-task model without additional training. Despite extensive research, task interference remains a major obstacle that often undermines the performance of merged models. In this paper, we propose ESM (Essential Subspace Merging), a robust framework for effective model merging. We begin by performing Principal Component Analysis (PCA) on feature shifts induced by parameter updates. The resulting principal directions span an essential subspace that dominantly influences feature representations. Each task's parameter update matrix is projected onto its respective essential subspace for low-rank decomposition before merging. This methodology mitigates inter-task interference while preserving core task-specific functionality. Furthermore, we introduce a…
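The pipeline outlined in the abstract (PCA on feature shifts, then projection of each task's update onto its own subspace before merging) can be illustrated with a minimal NumPy sketch for a single linear layer. This is not the authors' implementation. The function names, the choice to center the shifts, the output-side projection, and the plain summation used as the merge step are all assumptions made for illustration, and the additional component that the truncated abstract begins to introduce is not modeled here.

```python
import numpy as np


def essential_subspace(delta_w, inputs, k):
    """Top-k principal directions of the feature shifts caused by delta_w.

    delta_w: (d_out, d_in) task update, i.e. fine-tuned minus pre-trained weight.
    inputs:  (n, d_in) sample inputs to the layer (hypothetical calibration data).
    Returns an orthonormal basis of shape (d_out, k).
    """
    # Feature shift induced by the update for each input row: (n, d_out).
    shifts = inputs @ delta_w.T
    # Assumption: standard PCA, so the shifts are mean-centered first.
    shifts = shifts - shifts.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions in output-feature space.
    _, _, vt = np.linalg.svd(shifts, full_matrices=False)
    return vt[:k].T


def project_update(delta_w, basis):
    """Project the update onto span(basis); the result has rank at most k."""
    return basis @ (basis.T @ delta_w)


def merge(w0, task_weights, task_inputs, k):
    """Assumed merge rule: add every projected task update to the base weight."""
    merged = w0.copy()
    for w_t, x_t in zip(task_weights, task_inputs):
        delta_w = w_t - w0
        basis = essential_subspace(delta_w, x_t, k)
        merged += project_update(delta_w, basis)
    return merged
```

Here `k` plays the role of the essential-subspace dimension. Because each task gets its own basis, an update is reduced to the few output directions that most affect that task's features, which is the intuition the abstract gives for reduced interference.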
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
