Parameter Efficient Multi-task Model Fusion with Partial Linearization

Anke Tang; Li Shen; Yong Luo; Yibing Zhan; Han Hu; Bo Du; Yixin Chen; Dacheng Tao

arXiv:2310.04742·cs.LG·March 12, 2024

TL;DR

This paper introduces a partial linearization method for multi-task model fusion: only the adapter modules are linearized, and task arithmetic is applied over them, making it more efficient and effective to combine large pre-trained models fine-tuned on multiple tasks.

Contribution

It proposes a novel partial linearization technique for adapter modules that improves multi-task fusion in parameter-efficient fine-tuning methods like LoRA.

Findings

01 Outperforms standard adapter tuning in multi-task settings
02 Enables more effective fusion of multiple fine-tuned task vectors
03 Demonstrates scalability with an increasing number of tasks

Abstract

Large pre-trained models have enabled significant advances in machine learning and serve as foundation components. Model fusion methods, such as task arithmetic, have proven to be a powerful and scalable way to incorporate fine-tuned weights from different tasks into a multi-task model. However, efficiently fine-tuning large pre-trained models on multiple downstream tasks remains challenging, leading to inefficient multi-task model fusion. In this work, we propose a novel method to improve multi-task fusion for parameter-efficient fine-tuning techniques like LoRA fine-tuning. Specifically, our approach partially linearizes only the adapter modules and applies task arithmetic over the linearized adapters. This allows us to leverage the advantages of model fusion over linearized fine-tuning, while still performing fine-tuning and inference efficiently. We demonstrate that our partial…
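The task arithmetic step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `task_vector`, `task_arithmetic`, and `scaling` are hypothetical, parameters are represented as plain lists of floats rather than tensors, and the linearization of the adapters is not shown. It assumes the usual formulation in which each task vector is the difference between fine-tuned and pre-trained parameters, and the merged model adds the scaled sum of task vectors back onto the pre-trained parameters.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def task_vector(finetuned: Params, pretrained: Params) -> Params:
    """Task vector: element-wise difference (fine-tuned minus pre-trained)."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def task_arithmetic(
    pretrained: Params, finetuned_models: List[Params], scaling: float
) -> Params:
    """Merge models: pretrained + scaling * sum of all task vectors."""
    merged = {name: list(values) for name, values in pretrained.items()}
    for finetuned in finetuned_models:
        delta = task_vector(finetuned, pretrained)
        for name in merged:
            merged[name] = [
                m + scaling * d for m, d in zip(merged[name], delta[name])
            ]
    return merged


# Toy example with one "adapter" parameter and two fine-tuned variants.
pretrained = {"adapter": [0.0, 1.0]}
task_a = {"adapter": [1.0, 1.0]}
task_b = {"adapter": [0.0, 3.0]}
merged = task_arithmetic(pretrained, [task_a, task_b], scaling=0.5)
```

In the setting the abstract describes, the dictionary would hold only adapter parameters (for example, LoRA weights), which is what keeps the fusion step cheap; the scaling coefficient is a hyperparameter, and the value 0.5 here is chosen only for illustration.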

Code & Models

Repositories

tanganke/peta (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications · Fault Detection and Control Systems

Methods: Adapter