RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models
Zhuoran Jin, Pengfei Cao, Chenhao Wang, Zhitao He, Hongbang Yuan, Jiachun Li, Yubo Chen, Kang Liu, Jun Zhao

TL;DR
This paper introduces RWKU, a comprehensive benchmark for evaluating real-world knowledge unlearning in large language models, focusing on practical, challenging scenarios involving famous individuals without access to training data.
Contribution
We propose a new benchmark for LLM unlearning that considers realistic constraints and evaluates multiple aspects of unlearning effectiveness and knowledge retention.
Findings
Unlearning performance varies significantly across methods.
Popular knowledge is widely retained in LLMs after unlearning.
The benchmark provides a rigorous evaluation framework for real-world unlearning.
Abstract
Large language models (LLMs) inevitably memorize sensitive, copyrighted, and harmful knowledge from the training corpus; therefore, it is crucial to erase this knowledge from the models. Machine unlearning is a promising solution for efficiently removing specific knowledge by post hoc modifying models. In this paper, we propose a Real-World Knowledge Unlearning benchmark (RWKU) for LLM unlearning. RWKU is designed based on the following three key factors: (1) For the task setting, we consider a more practical and challenging unlearning setting, where neither the forget corpus nor the retain corpus is accessible. (2) For the knowledge source, we choose 200 real-world famous people as the unlearning targets and show that such popular knowledge is widely present in various LLMs. (3) For the evaluation framework, we design the forget set and the retain set to evaluate the model's…
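The forget-set / retain-set split described in point (3) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract is truncated before it names the actual metrics, so the substring-match accuracy, the probe format, the person "Ada Example", and the functions `accuracy`, `evaluate_unlearning`, and `toy_model` are hypothetical and are not taken from the RWKU code. The idea is only that a well-unlearned model should score low on probes about the target and stay high on unrelated probes.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical probe format: (question, expected answer).
Probe = Tuple[str, str]


def accuracy(model: Callable[[str], str], probes: List[Probe]) -> float:
    """Fraction of probes whose expected answer appears in the model output."""
    if not probes:
        return 0.0
    hits = sum(1 for question, answer in probes
               if answer.lower() in model(question).lower())
    return hits / len(probes)


def evaluate_unlearning(model: Callable[[str], str],
                        forget_set: List[Probe],
                        retain_set: List[Probe]) -> Dict[str, float]:
    """Score a model on both sets; lower forget and higher retain is better."""
    return {
        "forget_accuracy": accuracy(model, forget_set),
        "retain_accuracy": accuracy(model, retain_set),
    }


def toy_model(question: str) -> str:
    """Stand-in for an unlearned model: it answers only non-target questions."""
    known = {"What is the capital of France?": "The capital is Paris."}
    return known.get(question, "I don't know.")


# Made-up probes; "Ada Example" is a fictitious unlearning target.
forget_set = [("Where was Ada Example born?", "Springfield")]
retain_set = [("What is the capital of France?", "Paris")]

scores = evaluate_unlearning(toy_model, forget_set, retain_set)
print(scores)
```

In this toy setup the model fails the single forget probe and passes the single retain probe, which is the pattern an ideal unlearning method would aim for. A real evaluation would replace `toy_model` with calls to an actual LLM and would need a more robust answer-matching rule than substring comparison.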
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Sparse Evolutionary Training · High-Order Consensuses
