Continual Learning for Large Language Models: A Survey

Tongtong Wu; Linhao Luo; Yuan-Fang Li; Shirui Pan; Thuy-Trang Vu; Gholamreza Haffari

arXiv:2402.01364 · cs.CL · February 8, 2024 · 23 cites

Open Access

TL;DR

This survey reviews recent methods for enabling large language models to learn continuously without full retraining, categorizing techniques and discussing challenges and future directions.

Contribution

It introduces a novel multi-staged categorization scheme for continual learning techniques specific to LLMs and compares them with other adaptation strategies.

Findings

01 Catalogs continual learning techniques in three stages

02 Contrasts continual learning with retrieval-augmented methods

03 Identifies key challenges and future research directions

Abstract

Large language models (LLMs) are not amenable to frequent re-training, due to high training costs arising from their massive scale. However, updates are necessary to endow LLMs with new skills and keep them up-to-date with rapidly evolving human knowledge. This paper surveys recent works on continual learning for LLMs. Due to the unique nature of LLMs, we catalog continual learning techniques in a novel multi-staged categorization scheme, involving continual pretraining, instruction tuning, and alignment. We contrast continual learning for LLMs with simpler adaptation methods used in smaller models, as well as with other enhancement strategies like retrieval-augmented generation and model editing. Moreover, informed by a discussion of benchmarks and evaluation, we identify several challenges and future work directions for this crucial task.

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling