Decomposing the Basic Abilities of Large Language Models: Mitigating Cross-Task Interference in Multi-Task Instruct-Tuning

Bing Wang; Ximing Li; Changchun Li; Jinjin Chi; Gang Niu; Masashi Sugiyama

arXiv:2605.05676 · cs.CL · May 8, 2026

TL;DR

This paper introduces BADIT, a method that decomposes LLM parameters into orthogonal basic abilities to reduce cross-task interference in multi-task instruct-tuning, with reported gains on the SuperNI benchmark and across 6 LLMs.

Contribution

The paper proposes BADIT, a parameter decomposition approach that models each task as a combination of orthogonal basic abilities, aiming to reduce the cross-task interference that remains when methods only isolate task-specific parameters.
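To see why orthogonality matters here, a toy sketch may help. It is not BADIT's actual procedure, which this page does not describe; the basis vectors, the mixing coefficients, and the names `ability_basis` and `compose_update` are all hypothetical. It only illustrates the linear-algebra fact that updates built from disjoint orthogonal directions have zero inner product, while updates that reuse a direction do not.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical orthonormal "basic ability" directions in a 3-parameter space.
s = 1.0 / math.sqrt(2.0)
ability_basis = [
    [s, s, 0.0],    # ability 0
    [s, -s, 0.0],   # ability 1
    [0.0, 0.0, 1.0] # ability 2
]

def compose_update(coeffs):
    """Build a task update as a weighted sum of ability directions."""
    dim = len(ability_basis[0])
    update = [0.0] * dim
    for c, basis_vec in zip(coeffs, ability_basis):
        for i in range(dim):
            update[i] += c * basis_vec[i]
    return update

# Assumed mixing coefficients, chosen only for illustration.
update_a = compose_update([2.0, 0.0, 0.0])  # task A uses ability 0 only
update_b = compose_update([0.0, 3.0, 1.0])  # task B uses abilities 1 and 2
update_c = compose_update([1.0, 0.0, 2.0])  # task C reuses ability 0

overlap_ab = dot(update_a, update_b)  # disjoint abilities: no overlap
overlap_ac = dot(update_a, update_c)  # shared ability 0: nonzero overlap

print(f"A vs B overlap: {overlap_ab:.6f}")
print(f"A vs C overlap: {overlap_ac:.6f}")
```

In this toy setting, tasks A and B can be updated independently because their directions never overlap, whereas A and C still interact through the ability they share.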

Findings

01

BADIT outperforms state-of-the-art methods on the SuperNI benchmark.

02

It significantly reduces cross-task interference in multi-task instruct-tuning.

03

Empirical results show improved task performance across 6 LLMs.

Abstract

Recently, the strong performance of large language models (LLMs) has been largely driven by multi-task instruct-tuning. Unfortunately, this training paradigm suffers from a key issue, cross-task interference, caused by conflicting gradients from different tasks over shared parameters. Some previous methods mitigate this issue by isolating task-specific parameters, e.g., task-specific neuron selection and mixture-of-experts. In this paper, we empirically show that cross-task interference persists under these existing solutions because many parameters are still shared across tasks. Accordingly, we propose a novel solution, Basic Abilities Decomposition for multi-task Instruct-Tuning (BADIT). Specifically, we empirically find that certain parameters are consistently co-activated, and that co-activated parameters naturally organize into base groups. This…
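The notion of conflicting gradients over shared parameters can be made concrete with a small sketch. The following is an illustration under stated assumptions, not code from the paper: the two "tasks" are hypothetical single-example regression problems sharing one weight vector, and a negative cosine similarity between their gradients is used as a common proxy for conflict.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity; negative values indicate opposing directions."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def loss(w, x, y):
    """Squared-error loss 0.5 * (w.x - y)^2 for one example."""
    return 0.5 * (dot(w, x) - y) ** 2

def grad(w, x, y):
    """Gradient of the squared-error loss with respect to w."""
    err = dot(w, x) - y
    return [err * xi for xi in x]

# Shared parameters and two hypothetical tasks (input, target).
w = [0.5, -0.2]
task_a = ([1.0, 0.0], 2.0)
task_b = ([1.0, 0.5], -1.0)

g_a = grad(w, *task_a)
g_b = grad(w, *task_b)
conflict = cosine(g_a, g_b)

# Take one gradient step on task A only and see what happens to task B.
lr = 0.1
w_after_a = [wi - lr * gi for wi, gi in zip(w, g_a)]
loss_b_before = loss(w, *task_b)
loss_b_after = loss(w_after_a, *task_b)

print(f"cosine(g_a, g_b) = {conflict:.4f}")
print(f"task B loss: {loss_b_before:.4f} -> {loss_b_after:.4f}")
```

Here the cosine similarity is negative, and the step that helps task A raises task B's loss, which is the interference effect the abstract attributes to shared parameters.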
