Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal
Jianheng Huang, Leyang Cui, Ante Wang, Chengyi Yang, Xinting Liao, Linfeng Song, Junfeng Yao, Jinsong Su

TL;DR
This paper introduces Self-Synthesized Rehearsal (SSR), a novel method that uses LLMs to generate and refine synthetic data for continual learning, effectively mitigating catastrophic forgetting without relying on original training data.
Contribution
The paper proposes SSR, a new framework that leverages LLMs to generate and refine synthetic rehearsal data, addressing data availability issues in continual learning scenarios.
Findings
SSR achieves comparable or better performance than conventional rehearsal-based methods, which depend on previous training data.
SSR is more data-efficient than those methods and preserves the LLM's generalization capabilities.
Experimental results validate SSR's effectiveness in various tasks.
Abstract
Large language models (LLMs) suffer from catastrophic forgetting during continual learning. Conventional rehearsal-based methods rely on previous training data to retain the model's ability, which may not be feasible in real-world applications. When conducting continual learning based on a publicly-released LLM checkpoint, the original training data may be unavailable. To address this challenge, we propose a framework called Self-Synthesized Rehearsal (SSR) that uses the LLM to generate synthetic instances for rehearsal. Concretely, we first employ the base LLM for in-context learning to generate synthetic instances. Subsequently, we utilize the latest LLM to refine the instance outputs based on the synthetic inputs, preserving its acquired ability. Finally, we select diverse high-quality synthetic instances for rehearsal in future stages. Experimental results…
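The three steps named in the abstract (generate with the base LLM, refine with the latest LLM, select diverse instances) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. The names `ssr_round`, `build_icl_prompt`, `select_diverse`, `base_generate`, and `latest_generate` are hypothetical, and the two model callables are assumed to map a prompt string to a completion string. The prompt template is invented for illustration. The selection step uses a greedy max-min pick on token-level Jaccard distance as a simple stand-in, since the abstract does not say how diversity or quality is measured. The sketch also simplifies step 1 by keeping only the synthetic input from the base model and letting the latest model write the output.

```python
import random
from typing import Callable, List, Tuple

Instance = Tuple[str, str]  # (input text, output text)


def build_icl_prompt(shots: List[Instance]) -> str:
    """Format a few demonstrations so the model continues with a new input.

    The template is an assumption made for this sketch.
    """
    lines = [f"Input: {x}\nOutput: {y}\n" for x, y in shots]
    return "\n".join(lines) + "\nInput:"


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A ∩ B| / |A ∪ B| over lowercase whitespace tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def select_diverse(instances: List[Instance], k: int) -> List[Instance]:
    """Greedy max-min selection on input text (stand-in diversity criterion)."""
    if not instances or k <= 0:
        return []
    chosen = [instances[0]]
    remaining = list(instances[1:])
    while remaining and len(chosen) < k:
        # Pick the candidate whose nearest already-chosen input is farthest away.
        best = max(
            remaining,
            key=lambda inst: min(jaccard_distance(inst[0], c[0]) for c in chosen),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


def ssr_round(
    base_generate: Callable[[str], str],
    latest_generate: Callable[[str], str],
    demos: List[Instance],
    n_candidates: int,
    n_keep: int,
    seed: int = 0,
) -> List[Instance]:
    """One round of synthetic rehearsal data construction (sketch)."""
    rng = random.Random(seed)

    # Step 1: the base LLM synthesizes new inputs via in-context learning.
    synthetic_inputs = []
    for _ in range(n_candidates):
        shots = rng.sample(demos, k=min(2, len(demos)))
        synthetic_inputs.append(base_generate(build_icl_prompt(shots)).strip())

    # Drop empty generations and exact duplicates, keeping first-seen order.
    unique_inputs = [x for x in dict.fromkeys(synthetic_inputs) if x]

    # Step 2: the latest LLM (re)writes the output for each synthetic input.
    refined = [(x, latest_generate(x)) for x in unique_inputs]

    # Keep only instances with a non-empty output (crude quality filter,
    # an assumption of this sketch), then Step 3: select a diverse subset.
    refined = [(x, y) for x, y in refined if y.strip()]
    return select_diverse(refined, n_keep)
```

In use, the returned instances would be mixed with the new task's training data at the next continual-learning stage. In practice, an embedding-based measure would likely replace the token-overlap distance used here.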
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Balanced Selection
