More Than Catastrophic Forgetting: Integrating General Capabilities For Domain-Specific LLMs
Chengyuan Liu, Yangyang Kang, Shihang Wang, Lizhi Qing, Fubang Zhao, Changlong Sun, Kun Kuang, Fei Wu

TL;DR
This paper addresses the challenge of integrating general capabilities with domain-specific knowledge in LLMs. It formulates this challenge as General Capabilities Integration (GCI) and proposes a method named ALoRA to improve performance on domain-specific tasks.
Contribution
It introduces the concept of General Capabilities Integration (GCI) for harmonizing general and domain-specific skills in LLMs, along with the ALoRA method for dynamic information transfer.
Findings
GCI effectively combines general and domain-specific capabilities.
ALoRA improves information transfer and task performance.
Experiments validate the significance and effectiveness of the proposed approach.
Abstract
The performance on general tasks decreases after Large Language Models (LLMs) are fine-tuned on domain-specific tasks, a phenomenon known as Catastrophic Forgetting (CF). However, this paper presents a further challenge for real applications of domain-specific LLMs beyond CF, called General Capabilities Integration (GCI), which requires integrating both general capabilities and domain knowledge within a single instance. The objective of GCI is not merely to retain previously acquired general capabilities alongside new domain knowledge, but to harmonize and use both sets of skills cohesively to enhance performance on domain-specific tasks. Taking the legal domain as an example, we carefully design three groups of practical training and testing tasks and construct the corresponding datasets. To better incorporate general capabilities…
Taxonomy
Topics: Library Science and Information Systems · Natural Language Processing Techniques · Digital and Traditional Archives Management
Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention
