Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion

Anke Tang; Li Shen; Yong Luo; Liang Ding; Han Hu; Bo Du; Dacheng Tao

arXiv:2312.06173 · cs.LG · December 12, 2023 · 1 citation

Open Access · 1 Repo · 2 Datasets

TL;DR

This paper introduces Concrete subspace learning for multi-task model fusion: a meta-learning procedure identifies a low-dimensional subspace shared by the task-specific models, and merging within that subspace reduces interference among them.

Contribution

The paper proposes a bi-level optimization framework that uses gradient-based meta-learning to learn a shared subspace mask, which improves multi-task model merging without significant performance loss.
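As a rough illustration of what a "Concrete" mask means, the sketch below samples a relaxed binary mask from the standard binary Concrete (Gumbel-sigmoid) distribution. This page does not give the paper's exact parameterization, so the function name `sample_concrete_mask`, the `logits` and `temperature` arguments, and the example values are assumptions made for illustration; the bi-level meta-learning loop that would update the logits is omitted.

```python
import math
import random


def _sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_concrete_mask(logits, temperature, rng):
    """Draw one relaxed binary mask entry per logit (hypothetical helper).

    Each entry is sigmoid((logit + logistic_noise) / temperature), which is
    differentiable in the logit and approaches a hard 0/1 value as the
    temperature goes to zero.
    """
    mask = []
    for logit in logits:
        # Clamp the uniform sample away from 0 and 1 so the logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noise = math.log(u) - math.log(1.0 - u)  # Logistic(0, 1) noise
        mask.append(_sigmoid((logit + noise) / temperature))
    return mask


rng = random.Random(0)
# Assumed example logits: a large positive logit tends toward "keep",
# a large negative logit toward "drop".
mask = sample_concrete_mask([4.0, 0.0, -4.0], temperature=0.5, rng=rng)
print(mask)
```

Because the sampled values stay in the unit interval and depend smoothly on the logits, gradients can flow back to the mask parameters, which is what makes a gradient-based search over masks possible in principle.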

Findings

01. Effective interference elimination demonstrated on vision and language tasks

02. Outperforms existing model merging techniques in experiments

03. Code availability facilitates reproducibility and further research

Abstract

Merging models fine-tuned from a common, extensively pre-trained large model but specialized for different tasks has been demonstrated as a cheap and scalable strategy to construct a multi-task model that performs well across diverse tasks. Recent research, exemplified by task arithmetic, highlights that this multi-task model can be derived through arithmetic operations on task vectors. Nevertheless, current merging techniques frequently resolve potential conflicts among parameters from task-specific models by evaluating individual attributes, such as the parameters' magnitude or sign, overlooking their collective impact on the overall functionality of the model. In this work, we propose the CONtinuous relaxation of disCRETE (Concrete) subspace learning method to identify a common low-dimensional subspace and utilize its shared information to tackle the interference problem without…
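The task-arithmetic idea mentioned in the abstract can be sketched as follows: each task vector is the difference between fine-tuned and pre-trained parameters, and the merged model adds a scaled sum of task vectors back to the pre-trained weights. The optional element-wise `mask` shows one plausible way a shared subspace could restrict the merge; how the paper actually applies its mask is not stated on this page, so that part, along with the helper name `merge_task_vectors` and the toy parameter values, is an assumption.

```python
from typing import Dict, List, Optional

Params = Dict[str, List[float]]


def merge_task_vectors(
    pretrained: Params,
    finetuned_models: List[Params],
    scale: float,
    mask: Optional[Params] = None,
) -> Params:
    """Task arithmetic: pretrained + scale * mask * sum_i (finetuned_i - pretrained).

    `mask` is an assumed element-wise 0/1 (or relaxed) mask shared by all tasks;
    when it is None, every parameter takes part in the merge.
    """
    merged: Params = {}
    for name, base in pretrained.items():
        m = mask[name] if mask is not None else [1.0] * len(base)
        summed = [0.0] * len(base)
        for model in finetuned_models:
            for i, (f, p) in enumerate(zip(model[name], base)):
                summed[i] += f - p  # accumulate this task's vector
        merged[name] = [p + scale * mi * s for p, mi, s in zip(base, m, summed)]
    return merged


# Toy parameters, made up for illustration only.
pretrained = {"w": [1.0, 1.0, 1.0]}
task_a = {"w": [2.0, 1.0, 0.0]}
task_b = {"w": [2.0, 3.0, 4.0]}

plain = merge_task_vectors(pretrained, [task_a, task_b], scale=0.5)
masked = merge_task_vectors(
    pretrained, [task_a, task_b], scale=0.5, mask={"w": [1.0, 1.0, 0.0]}
)
print(plain)   # {'w': [2.0, 2.0, 2.0]}
print(masked)  # {'w': [2.0, 2.0, 1.0]}
```

In the toy example the two tasks pull the third coordinate in opposite directions, so masking it out leaves that coordinate at its pre-trained value instead of merging the conflicting updates.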

Code & Models

Repositories

tanganke/subspace_fusion (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Machine Learning and ELM

Methods: Focus