RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models

Zhuoran Jin; Pengfei Cao; Chenhao Wang; Zhitao He; Hongbang Yuan; Jiachun Li; Yubo Chen; Kang Liu; Jun Zhao

arXiv:2406.10890 · cs.CL · June 18, 2024 · 1 citation

TL;DR

This paper introduces RWKU, a benchmark for evaluating real-world knowledge unlearning in large language models. It targets a practical, challenging setting in which knowledge about famous individuals must be unlearned without access to the forget corpus or the retain corpus.

Contribution

We propose a new benchmark for LLM unlearning that considers realistic constraints and evaluates multiple aspects of unlearning effectiveness and knowledge retention.

Findings

01. Unlearning performance varies significantly across methods.

02. Popular knowledge is widely retained in LLMs after unlearning.

03. The benchmark provides a rigorous evaluation framework for real-world unlearning.

Abstract

Large language models (LLMs) inevitably memorize sensitive, copyrighted, and harmful knowledge from the training corpus; therefore, it is crucial to erase this knowledge from the models. Machine unlearning is a promising solution for efficiently removing specific knowledge by post hoc modifying models. In this paper, we propose a Real-World Knowledge Unlearning benchmark (RWKU) for LLM unlearning. RWKU is designed based on the following three key factors: (1) For the task setting, we consider a more practical and challenging unlearning setting, where neither the forget corpus nor the retain corpus is accessible. (2) For the knowledge source, we choose 200 real-world famous people as the unlearning targets and show that such popular knowledge is widely present in various LLMs. (3) For the evaluation framework, we design the forget set and the retain set to evaluate the model's…

Code & Models

Repositories

jinzhuoran/rwku
PyTorch · Official

Datasets

jinzhuoran/RWKU
dataset · 2.6k downloads

Videos

RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training · High-Order Consensuses