Decomposing the Basic Abilities of Large Language Models: Mitigating Cross-Task Interference in Multi-Task Instruct-Tuning
Bing Wang, Ximing Li, Changchun Li, Jinjin Chi, Gang Niu, Masashi Sugiyama

TL;DR
This paper introduces BADIT, a method that decomposes LLM parameters into orthogonal basic abilities to reduce cross-task interference in multi-task instruct-tuning, improving performance on the SuperNI benchmark.
Contribution
The paper proposes BADIT, a parameter decomposition approach that models tasks as combinations of orthogonal basic abilities, targeting the cross-task interference that persists when different tasks still share parameters.
Findings
BADIT outperforms state-of-the-art methods on the SuperNI benchmark.
It significantly reduces cross-task interference in multi-task instruct-tuning.
Empirical results show improved task performance across six LLMs.
Abstract
The strong recent performance of large language models (LLMs) has been driven largely by multi-task instruct-tuning. Unfortunately, this training paradigm suffers from a key issue, cross-task interference, caused by conflicting gradients over parameters shared among different tasks. Some previous methods mitigate this issue by isolating task-specific parameters, e.g., through task-specific neuron selection or mixture-of-experts. In this paper, we empirically show that cross-task interference persists in these existing solutions because many parameters are still shared across tasks. Accordingly, we propose a novel solution, Basic Abilities Decomposition for multi-task Instruct-Tuning (BADIT). Specifically, we empirically find that certain parameters are consistently co-activated, and that co-activated parameters naturally organize into base groups. This…
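The conflicting gradients that the abstract names as the cause of cross-task interference can be illustrated with a toy calculation. The sketch below is not BADIT and is not taken from the paper; the gradient values and the names `grad_task_a` and `grad_task_b` are made up for illustration. It only shows that when two task gradients over the same shared parameters point in opposing directions (negative cosine similarity), their average is a much smaller step than either task would take alone.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(u, u))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return dot(u, v) / (norm(u) * norm(v))


# Hypothetical gradients of two task losses with respect to the
# same three shared parameters (illustrative values only).
grad_task_a = [1.0, 2.0, -1.0]
grad_task_b = [-1.0, -1.0, 2.0]

# A negative cosine similarity means the two tasks pull the
# shared parameters in opposing directions.
similarity = cosine(grad_task_a, grad_task_b)
conflicting = similarity < 0.0

# A plain multi-task update averages the two gradients, so the
# opposing components largely cancel.
avg_grad = [(a + b) / 2 for a, b in zip(grad_task_a, grad_task_b)]

print(f"cosine similarity: {similarity:.3f}")
print(f"conflicting: {conflicting}")
print(f"norm of task A gradient: {norm(grad_task_a):.3f}")
print(f"norm of averaged gradient: {norm(avg_grad):.3f}")
```

In this toy setting the averaged step is far shorter than either individual gradient, which is one simple way to picture why shared parameters can limit per-task progress.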
