Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters

Yuhang Zhou; Zihua Zhao; Haolin Li; Siyuan Du; Jiangchao Yao; Ya Zhang; Yanfeng Wang

arXiv:2406.09679·cs.CV·June 17, 2024·1 cites

TL;DR

This paper introduces Mixture of Low-rank Adapters (MoLA), a novel method to effectively train unified models on heterogeneous data by mitigating conflicts through task-specific adapters and a task-wise decorrelation loss.

Contribution

It proposes two variants, MoLA-Grad and MoLA-Router, to handle target-aware and target-agnostic scenarios, advancing multi-task learning with low-rank adapters.

Findings

01 MoLA outperforms previous state-of-the-art methods.

02 MoLA effectively mitigates training conflicts among heterogeneous data.

03 In-depth analysis reveals the working mechanism of MoLA.

Abstract

Training a unified model to take multiple targets into account is a trend towards artificial general intelligence. However, how to efficiently mitigate the training conflicts among heterogeneous data collected from different domains or tasks remains under-explored. In this study, we explore leveraging Mixture of Low-rank Adapters (MoLA) to mitigate conflicts in heterogeneous data training, which requires jointly training the multiple low-rank adapters and their shared backbone. Specifically, we introduce two variants of MoLA, namely MoLA-Grad and MoLA-Router, to handle the target-aware and target-agnostic scenarios during inference, respectively. The former uses task identifiers to assign personalized low-rank adapters to each task, disentangling task-specific knowledge into the corresponding adapters and thereby mitigating heterogeneity conflicts. The latter uses a novel Task-wise Decorrelation…
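The target-aware idea described in the abstract (a shared backbone plus one low-rank adapter per task, selected by a task identifier) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the class name `TaskRoutedLoRALinear`, the rank, the initialization scheme, and the use of a single linear layer are all assumptions made for the example, and the training procedure is not shown.

```python
import numpy as np


class TaskRoutedLoRALinear:
    """Shared linear layer plus one low-rank adapter per task (hypothetical sketch)."""

    def __init__(self, d_in, d_out, num_tasks, rank=4, seed=0):
        rng = np.random.default_rng(seed)
        # Shared backbone weight, used by every task.
        self.W = rng.normal(0.0, 0.02, size=(d_in, d_out))
        # Per-task low-rank factors. B starts at zero so each adapter is
        # initially a no-op (a common LoRA convention, assumed here).
        self.A = rng.normal(0.0, 0.02, size=(num_tasks, d_in, rank))
        self.B = np.zeros((num_tasks, rank, d_out))

    def forward(self, x, task_id):
        # Shared path: identical for all tasks.
        shared = x @ self.W
        # Task-specific path: only the adapter indexed by task_id is applied.
        delta = (x @ self.A[task_id]) @ self.B[task_id]
        return shared + delta


# Usage: the same input routed through two different task adapters.
layer = TaskRoutedLoRALinear(d_in=8, d_out=5, num_tasks=3, rank=2)
x = np.ones((2, 8))
out_task0 = layer.forward(x, task_id=0)
out_task1 = layer.forward(x, task_id=1)
```

Because each task only ever touches its own `A[task_id]` and `B[task_id]`, updates to one task's adapter leave the other tasks' outputs unchanged, which is the intuition behind separating task-specific knowledge from the shared weights.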

Code & Models

Repositories

mediabrain-sjtu/mola
PyTorch · Official

Taxonomy

Topics: Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference