Role Prompting Guided Domain Adaptation with General Capability Preserve for Large Language Models

Rui Wang; Fei Mi; Yi Chen; Boyang Xue; Hongru Wang; Qi Zhu; Kam-Fai Wong; Ruifeng Xu

arXiv:2403.02756·cs.CL·March 6, 2024·1 cites

TL;DR

This paper introduces REGA, a novel multi-domain adaptation strategy for large language models that preserves general capabilities while improving domain-specific performance by managing catastrophic forgetting and inter-domain confusion.

Contribution

REGA combines self-distillation, role prompting, and role integration to effectively adapt LLMs to multiple domains without sacrificing their general capabilities.

Findings

1) REGA reduces catastrophic forgetting in multi-domain LLM adaptation.

2) REGA improves domain-specific performance over standard fine-tuning.

3) REGA maintains strong general capabilities while adapting to specific domains.

Abstract

The growing interest in Large Language Models (LLMs) for specialized applications has revealed a significant challenge: when tailored to specific domains, LLMs tend to experience catastrophic forgetting, compromising their general capabilities and leading to a suboptimal user experience. Additionally, crafting a versatile model for multiple domains simultaneously often results in a decline in overall performance due to confusion between domains. In response to these issues, we present the RolE Prompting Guided Multi-Domain Adaptation (REGA) strategy. This novel approach effectively manages multi-domain LLM adaptation through three key components: 1) Self-Distillation constructs and replays general-domain exemplars to alleviate catastrophic forgetting. 2) Role Prompting assigns a central prompt to the general domain and a unique role prompt to each specific domain to minimize…
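
The first two components described above can be illustrated with a minimal sketch of how a training set might be assembled. This is not the authors' implementation: the prompt strings, the `replay_ratio` parameter, and the function names are all hypothetical, and the general-domain exemplars are assumed to have been generated by the base model beforehand. Each domain example is prefixed with its own role prompt, while replayed general-domain exemplars are prefixed with the central prompt.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical prompt strings; the paper's actual prompts are not given here.
CENTRAL_PROMPT = "You are a helpful general-purpose assistant."
ROLE_PROMPTS: Dict[str, str] = {
    "medical": "You are a medical expert.",
    "legal": "You are a legal expert.",
}

Pair = Tuple[str, str]  # (instruction, response)


def with_prompt(prompt: str, instruction: str, response: str) -> Dict[str, str]:
    """Prefix an instruction with a role or central prompt."""
    return {"input": f"{prompt}\n{instruction}", "output": response}


def build_training_set(
    domain_data: Dict[str, List[Pair]],
    general_exemplars: List[Pair],
    replay_ratio: float = 0.2,  # assumed knob, not taken from the paper
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Mix role-prompted domain data with replayed general-domain exemplars."""
    rng = random.Random(seed)
    examples: List[Dict[str, str]] = []

    # Role prompting: each specific domain gets its own unique prompt.
    for domain, pairs in domain_data.items():
        role = ROLE_PROMPTS[domain]
        for instruction, response in pairs:
            examples.append(with_prompt(role, instruction, response))

    # Replay: general-domain exemplars use the central prompt.
    n_replay = min(len(general_exemplars), int(replay_ratio * len(examples)))
    for instruction, response in rng.sample(general_exemplars, n_replay):
        examples.append(with_prompt(CENTRAL_PROMPT, instruction, response))

    rng.shuffle(examples)
    return examples
```

Keeping the central prompt and the role prompts as distinct prefixes is what lets a model tell general-domain and domain-specific examples apart during training; how the paper then combines them is not covered by this sketch.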

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning