TOFU: A Task of Fictitious Unlearning for LLMs

Pratyush Maini; Zhili Feng; Avi Schwarzschild; Zachary C. Lipton; J. Zico Kolter

arXiv:2401.06121·cs.LG·January 12, 2024·5 cites

TL;DR

This paper introduces TOFU, a benchmark for evaluating unlearning methods in large language models, and shows that existing approaches fall short of effectively forgetting specific data.

Contribution

It provides a synthetic dataset, a suite of metrics, and baseline results to facilitate research on effective unlearning in large language models.

Findings

01. Existing unlearning algorithms are ineffective at fully forgetting data.

02. The TOFU benchmark enables systematic evaluation of unlearning methods.

03. Baseline results show a need for improved unlearning techniques.

Abstract

Large language models trained on massive corpora of data from the web can memorize and reproduce sensitive or private data, raising both legal and ethical concerns. Unlearning, or tuning models to forget information present in their training data, provides us with a way to protect private data after training. Although several methods exist for such unlearning, it is unclear to what extent they result in models equivalent to those where the data to be forgotten was never learned in the first place. To address this challenge, we present TOFU, a Task of Fictitious Unlearning, as a benchmark aimed at helping deepen our understanding of unlearning. We offer a dataset of 200 diverse synthetic author profiles, each consisting of 20 question-answer pairs, and a subset of these profiles called the forget set that serves as the target for unlearning. We compile a suite of metrics that work…
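
The dataset layout described in the abstract (200 author profiles, 20 question-answer pairs each, with a subset of profiles designated as the forget set) can be illustrated with a minimal sketch. This is not the authors' code: the `Profile` class, the helper names, the placeholder question and answer strings, and the `forget_fraction` value are all assumptions made for illustration. The point is only that the split happens at the level of whole profiles, so every question about a forgotten author leaves the retained data together.

```python
import random
from dataclasses import dataclass


@dataclass
class Profile:
    """One synthetic author with their question-answer pairs (hypothetical structure)."""
    author_id: int
    qa_pairs: list  # list of (question, answer) tuples


def build_placeholder_profiles(num_authors=200, pairs_per_author=20):
    """Create stand-in profiles with dummy text; the real benchmark uses generated author data."""
    return [
        Profile(a, [(f"question {a}-{i}", f"answer {a}-{i}") for i in range(pairs_per_author)])
        for a in range(num_authors)
    ]


def split_forget_retain(profiles, forget_fraction, seed=0):
    """Partition whole profiles into a forget set and a retain set.

    forget_fraction is an illustrative parameter, not a value taken from the abstract.
    """
    rng = random.Random(seed)
    num_forget = round(len(profiles) * forget_fraction)
    forget_ids = set(rng.sample(range(len(profiles)), num_forget))
    forget = [p for i, p in enumerate(profiles) if i in forget_ids]
    retain = [p for i, p in enumerate(profiles) if i not in forget_ids]
    return forget, retain


profiles = build_placeholder_profiles()
forget_set, retain_set = split_forget_retain(profiles, forget_fraction=0.05)
print(len(forget_set), len(retain_set))
```

An unlearning method would then be applied to a model fine-tuned on all profiles, with the forget set as the target and the retain set used to check that unrelated knowledge survives.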

Taxonomy

Topics: Topic Modeling · Privacy-Preserving Technologies in Data · Interpreting and Communication in Healthcare

Methods: Sparse Evolutionary Training