Tensorized Clustered LoRA Merging for Multi-Task Interference

Zhan Su; Fengran Mo; Guojun Liang; Jinghan Zhang; Bingbing Wen; Prayag Tiwari; Jian-Yun Nie

arXiv:2508.03999·cs.LG·August 7, 2025


TL;DR

This paper introduces TC-LoRA, a tensorized clustered LoRA merging method that reduces task interference in multi-task LLM adaptation. It clusters training samples and disentangles task-specific from shared factors across adapters, improving accuracy by 1.4% on Phi-3 and 2.3% on Mistral-7B.

Contribution

The paper proposes a novel tensorized clustered LoRA approach with text-level clustering and parameter-level factorization to mitigate task interference in multi-task LLM fine-tuning.
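To make the text-level clustering step concrete, here is a minimal sketch of grouping sample embeddings so that each group could train its own adapter. The choice of k-means, the toy 2-D vectors, and the `kmeans` helper are all assumptions for illustration; the source does not state which clustering algorithm or embedding model is used.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm) over equal-length float vectors.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each sample goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy 2-D "embeddings": two well-separated groups standing in for
# two different input formats (assumed data, not from the paper).
embeddings = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],
]
labels, centroids = kmeans(embeddings, k=2)

# Bucket sample indices by cluster; each bucket would then be used
# to train one specialized LoRA adapter.
clusters = {}
for idx, lab in enumerate(labels):
    clusters.setdefault(lab, []).append(idx)
```

In practice the vectors would come from a sentence encoder and the number of clusters would be a tuned hyperparameter; both are left open here because the source does not specify them.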

Findings

01. Improves accuracy by 1.4% on Phi-3

02. Improves accuracy by 2.3% on Mistral-7B

03. Reduces task interference in multi-task settings

Abstract

Despite the success of the monolithic dense paradigm of large language models (LLMs), LoRA adapters offer an efficient alternative: small task-specific modules are fine-tuned and then merged with the base model. However, in multi-task settings, merging LoRA adapters trained on heterogeneous sources frequently causes task interference, degrading downstream performance. To address this, we propose a tensorized clustered LoRA (TC-LoRA) library that targets task interference at both the text level and the parameter level. At the text level, we cluster the training samples in the embedding space to capture input-format similarities, then train a specialized LoRA adapter for each cluster. At the parameter level, we introduce a joint Canonical Polyadic (CP) decomposition that disentangles task-specific and shared factors across LoRA adapters.…
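As a rough illustration of what a joint CP decomposition over LoRA adapters can look like, the sketch below stacks the adapter updates into a third-order tensor and factorizes it. The tensor construction, the symbols K, d, m, R, C, U, V, and the stacking order are assumptions made for this sketch; the truncated abstract does not give the paper's exact formulation.

```latex
% Assumed setup (not taken from the source): K adapters, each with update
% \Delta W_k = B_k A_k \in \mathbb{R}^{d \times m}, stacked along the first mode.
\mathcal{T} \in \mathbb{R}^{K \times d \times m}, \qquad
\mathcal{T}_{k,:,:} = \Delta W_k = B_k A_k

% Rank-R CP decomposition: a sum of R rank-one tensors.
\mathcal{T} \approx \sum_{r=1}^{R} c_r \circ u_r \circ v_r,
\qquad c_r \in \mathbb{R}^{K},\; u_r \in \mathbb{R}^{d},\; v_r \in \mathbb{R}^{m}

% Slice for adapter k, with C = [c_1, \dots, c_R], U = [u_1, \dots, u_R],
% V = [v_1, \dots, v_R]: U and V are shared, row C_{k,:} is adapter-specific.
\Delta W_k \approx \sum_{r=1}^{R} C_{k r}\, u_r v_r^{\top}
= U \,\operatorname{diag}(C_{k,:})\, V^{\top}
```

Under this reading, the factor matrices U and V play the role of shared factors, while each row of C carries the weights specific to one adapter.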
