MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning

Xujia Wang; Haiyan Zhao; Shuo Wang; Hanqing Wang; Zhiyuan Liu

arXiv:2410.22782·cs.CL·October 31, 2024

TL;DR

MALoRA introduces an asymmetric low-rank adaptation framework that improves multi-task learning efficiency and stability, reducing parameters, increasing training speed, and outperforming existing methods across various tasks.

Contribution

The paper proposes MALoRA, an asymmetric optimization approach for LoRA experts that improves multi-task learning while reducing trainable parameters and increasing training speed.

Findings

01. Reduces trainable parameters by 30% to 48%

02. Increases training speed by 1.2x

03. Outperforms baseline methods in diverse multi-task scenarios

Abstract

Parameter-Efficient Fine-Tuning (PEFT) methods like LoRA have significantly improved the adaptation of LLMs to downstream tasks in a resource-efficient manner. However, in multi-task scenarios, challenges such as training imbalance and the seesaw effect frequently emerge. Mixture-of-LoRA (MoLoRA), which combines LoRA with sparse Mixture-of-Experts, mitigates some of these issues by promoting task-specific learning across experts. Despite this, MoLoRA remains inefficient in terms of training speed, parameter utilization, and overall multi-task performance. In this paper, we propose Mixture of Asymmetric Low-Rank Adaptation (MALoRA), a flexible fine-tuning framework that leverages asymmetric optimization across LoRA experts. MALoRA reduces the number of trainable parameters by 30% to 48%, increases training speed by 1.2x, and matches the computational efficiency of single-task LoRA models.…
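To make the MoLoRA baseline described in the abstract more concrete, the following is a minimal sketch of a layer that combines LoRA experts with sparse top-k routing. It is not the authors' implementation and does not show MALoRA's asymmetric scheme, which the abstract does not specify. The class name `MoLoRALayer`, the linear softmax router, the renormalization of the selected gates, the zero initialization of the up-projections, and all dimensions are assumptions chosen for illustration.

```python
import math
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random Gaussian matrix (illustrative initialization)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class MoLoRALayer:
    """Hypothetical Mixture-of-LoRA layer: frozen W0 plus routed LoRA experts."""

    def __init__(self, d_in, d_out, rank, n_experts, top_k, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        self.n_experts, self.top_k = n_experts, top_k
        self.scale = alpha / rank                      # usual LoRA scaling
        self.W0 = rand_matrix(d_out, d_in, rng)        # frozen pretrained weight
        self.router = rand_matrix(n_experts, d_in, rng)  # assumed linear router
        # Each expert owns a down-projection A (rank x d_in) and an
        # up-projection B (d_out x rank); B starts at zero as in plain LoRA.
        self.A = [rand_matrix(rank, d_in, rng) for _ in range(n_experts)]
        self.B = [[[0.0] * rank for _ in range(d_out)] for _ in range(n_experts)]

    def trainable_params(self):
        """Count expert and router parameters (W0 is frozen)."""
        per_expert = self.rank * self.d_in + self.d_out * self.rank
        return self.n_experts * per_expert + self.n_experts * self.d_in

    def forward(self, x):
        """Return the layer output and the indices of the selected experts."""
        gates = softmax(matvec(self.router, x))
        chosen = sorted(range(self.n_experts), key=lambda i: gates[i],
                        reverse=True)[: self.top_k]
        norm = sum(gates[i] for i in chosen)           # renormalize top-k gates
        out = matvec(self.W0, x)
        for i in chosen:
            delta = matvec(self.B[i], matvec(self.A[i], x))
            w = gates[i] / norm
            out = [o + self.scale * w * d for o, d in zip(out, delta)]
        return out, chosen


layer = MoLoRALayer(d_in=4, d_out=3, rank=2, n_experts=4, top_k=2)
y, experts_used = layer.forward([1.0, 0.5, -1.0, 2.0])
```

Because every expert stores its own pair of projection matrices, the trainable parameter count in this sketch grows linearly with the number of experts, which is the kind of overhead the abstract refers to when it calls MoLoRA inefficient in parameter utilization.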

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings