Role Prompting Guided Domain Adaptation with General Capability Preserve for Large Language Models
Rui Wang, Fei Mi, Yi Chen, Boyang Xue, Hongru Wang, Qi Zhu, Kam-Fai Wong, Ruifeng Xu

TL;DR
This paper introduces REGA, a novel multi-domain adaptation strategy for large language models that preserves general capabilities while improving domain-specific performance by managing catastrophic forgetting and inter-domain confusion.
Contribution
REGA combines self-distillation, role prompting, and role integration to effectively adapt LLMs to multiple domains without sacrificing their general capabilities.
Findings
REGA reduces catastrophic forgetting in multi-domain LLM adaptation.
REGA improves domain-specific performance over standard fine-tuning.
REGA maintains strong general capabilities while adapting to specific domains.
Abstract
The growing interest in Large Language Models (LLMs) for specialized applications has revealed a significant challenge: when tailored to specific domains, LLMs tend to experience catastrophic forgetting, compromising their general capabilities and leading to a suboptimal user experience. Additionally, crafting a versatile model for multiple domains simultaneously often results in a decline in overall performance due to confusion between domains. In response to these issues, we present the RolE Prompting Guided Multi-Domain Adaptation (REGA) strategy. This novel approach effectively manages multi-domain LLM adaptation through three key components: 1) Self-Distillation constructs and replays general-domain exemplars to alleviate catastrophic forgetting. 2) Role Prompting assigns a central prompt to the general domain and a unique role prompt to each specific domain to minimize…
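The first two components described in the abstract can be pictured as a data-construction step. The following is a minimal sketch under stated assumptions, not the authors' implementation: the prompt wording, the domain names, the `generate` callable, and the `replay_ratio` parameter are all hypothetical and do not come from the paper. It shows general-domain exemplars being produced by the base model itself (self-distillation), then replayed alongside domain data, with a central prompt on general examples and a distinct role prompt on each domain's examples.

```python
import random

# Hypothetical prompt wording; the paper's actual prompts are not given here.
CENTRAL_PROMPT = "You are a helpful general-purpose assistant."
ROLE_PROMPTS = {
    "medical": "You are a medical expert.",
    "legal": "You are a legal expert.",
}


def self_distill(instructions, generate):
    """Build general-domain exemplars by letting the base model answer
    general instructions. `generate` is an assumed callable: str -> str."""
    return [{"instruction": q, "response": generate(q)} for q in instructions]


def with_role(example, domain=None):
    """Prefix an example with its domain role prompt, or with the central
    prompt when no domain is given (general-domain data)."""
    prompt = ROLE_PROMPTS[domain] if domain is not None else CENTRAL_PROMPT
    return {
        "prompt": f"{prompt}\n{example['instruction']}",
        "response": example["response"],
    }


def build_training_mix(domain_data, general_exemplars, replay_ratio=0.2, seed=0):
    """Combine role-prompted domain data with replayed general exemplars.
    `replay_ratio` (assumed) sets replay size relative to the domain data."""
    rng = random.Random(seed)
    mixed = [
        with_role(ex, domain)
        for domain, examples in domain_data.items()
        for ex in examples
    ]
    n_replay = min(len(general_exemplars), int(replay_ratio * len(mixed)))
    mixed += [with_role(ex) for ex in rng.sample(general_exemplars, n_replay)]
    rng.shuffle(mixed)
    return mixed
```

In this sketch the replayed responses come from the base model rather than from human labels, which is the sense in which the replay data is "self-distilled"; the resulting mix would then be passed to an ordinary fine-tuning loop.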
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning
