BYOM: Building Your Own Multi-Task Model For Free

Weisen Jiang; Baijiong Lin; Han Shi; Yu Zhang; Zhenguo Li; and James T. Kwok

arXiv:2310.01886 · cs.LG · February 6, 2024 · 2 citations

TL;DR

This paper introduces two data-free, parameter-efficient methods for building a multi-task model from task-specific finetuned models: BYOM-FFT for fully finetuned models and BYOM-LoRA for LoRA-finetuned models. Both outperform existing merging methods by a large margin.

Contribution

The paper injects task-specific knowledge into a merged model through two data-free, computation-efficient approaches, reducing the performance gap between a single merged multi-task model and separate task-specific models.

Findings

01 BYOM methods outperform existing merging methods by a large margin.

02 BYOM-FFT is general and can be integrated into existing merging methods to further boost performance.

03 The methods are evaluated on both computer vision and natural language processing tasks.

Abstract

Recently, various merging methods have been proposed to build a multi-task model from task-specific finetuned models without retraining. However, existing methods suffer from a large performance deterioration compared to using multiple task-specific models. In this paper, we propose to inject task-specific knowledge into the merged model and design two parameter-efficient approaches (BYOM-FFT and BYOM-LoRA) to Build Your Own Multi-task model. BYOM-FFT is for merging fully finetuned models, while BYOM-LoRA is for LoRA-finetuned models. Both methods are data-free and computation-efficient. Extensive experiments on computer vision and natural language processing tasks show that the proposed BYOM methods outperform existing merging methods by a large margin. Moreover, BYOM-FFT is general and can be integrated into existing merging methods to further boost performance.
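The abstract does not spell out how task-specific knowledge is injected, so the following is only a minimal sketch of one plausible reading, under stated assumptions: the shared model is built by a task-arithmetic style merge, and each task additionally stores a small sparse correction holding the largest-magnitude differences between its finetuned weights and the merged weights. The function names, the `scale` and `keep_ratio` parameters, and the flat-list weight representation are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: NOT the authors' implementation.
# Weights are modeled as flat lists of floats (an assumption for brevity).

def task_arithmetic_merge(pretrained, finetuned_models, scale=0.5):
    """Merge models by adding the scaled sum of task vectors to the pretrained weights."""
    merged = list(pretrained)
    for theta in finetuned_models:
        for i, (w, w0) in enumerate(zip(theta, pretrained)):
            merged[i] += scale * (w - w0)
    return merged


def sparse_correction(theta_t, merged, keep_ratio=0.1):
    """Keep only the largest-magnitude entries of (theta_t - merged) as {index: value}."""
    diff = [a - b for a, b in zip(theta_t, merged)]
    k = max(1, int(len(diff) * keep_ratio))  # number of entries to store
    top = sorted(range(len(diff)), key=lambda i: abs(diff[i]), reverse=True)[:k]
    return {i: diff[i] for i in top}


def task_model(merged, correction):
    """Rebuild task-specific weights: shared merged weights plus the sparse correction."""
    out = list(merged)
    for i, value in correction.items():
        out[i] += value
    return out


pretrained = [0.0, 0.0, 0.0, 0.0]
finetuned_models = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 2.0]]

merged = task_arithmetic_merge(pretrained, finetuned_models, scale=0.5)
correction_0 = sparse_correction(finetuned_models[0], merged, keep_ratio=0.25)
weights_0 = task_model(merged, correction_0)
```

In this sketch no training data is touched at any point, and each task stores only a fraction of the full parameter count, which is the sense in which such a scheme would be data-free and parameter-efficient. Setting `keep_ratio=1.0` recovers the original finetuned weights exactly; smaller values trade accuracy for storage.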

Taxonomy

Topics: Computer Graphics and Visualization Techniques