A Survey on Unlearning in Large Language Models
Ruichen Qiu, Jiajun Tan, Jiayue Pu, Honglin Wang, Xiao-Shan Gao, Fei Sun

TL;DR
This survey reviews over 180 papers on unlearning in large language models, introducing a new taxonomy, analyzing evaluation methods, and discussing open challenges for secure AI development.
Contribution
It presents a novel taxonomy for LLM unlearning methods, offers a comprehensive analysis of evaluation paradigms, and guides future research directions.
Findings
Introduces a taxonomy categorizing unlearning methods by intervention phase
Analyzes 18 datasets and 10 categories of knowledge metrics
Provides insights into current challenges and future research directions
Abstract
Large Language Models (LLMs) demonstrate remarkable capabilities, but their training on massive corpora poses significant risks from memorized sensitive information. To mitigate these risks and align with legal standards, unlearning has emerged as a critical technique for selectively erasing specific knowledge from LLMs without compromising their overall performance. This survey provides a systematic review of over 180 papers on LLM unlearning published since 2021. First, it introduces a novel taxonomy that categorizes unlearning methods by the phase of the LLM pipeline at which the intervention occurs. This framework further distinguishes between parameter modification and parameter selection strategies, enabling deeper insights and more informed comparative analysis. Second, it offers a multidimensional analysis of evaluation paradigms. For datasets, it compares 18 existing benchmarks from…
