Digital Forgetting in Large Language Models: A Survey of Unlearning Methods

Alberto Blanco-Justicia; Najeeb Jebreel; Benet Manzanares; David Sánchez; Josep Domingo-Ferrer; Guillem Collell; Kuan Eeik Tan

arXiv:2404.02062 · cs.CR · January 14, 2025 · 2 cites

Open Access

TL;DR

This survey comprehensively reviews methods for digital forgetting in large language models, emphasizing unlearning techniques that aim to efficiently remove undesired knowledge while preserving model performance.

Contribution

It provides a detailed taxonomy and comparison of unlearning approaches for LLMs, along with evaluation datasets, metrics, and discussion of current challenges.

Findings

01

Unlearning methods are the current state of the art for digital forgetting in LLMs.

02

Effective forgetting requires balancing removal of undesired knowledge with retention of useful capabilities.

03

Scalability and efficiency are critical challenges in implementing unlearning methods.

Abstract

The objective of digital forgetting is, given a model with undesirable knowledge or behavior, to obtain a new model in which the detected issues are no longer present. The motivations for forgetting include privacy protection, copyright protection, elimination of biases and discrimination, and prevention of harmful content generation. Digital forgetting has to be effective (meaning the new model has actually forgotten the undesired knowledge/behavior), retain the performance of the original model on the desirable tasks, and be scalable (in particular, forgetting has to be more efficient than retraining from scratch on just the tasks/data to be retained). This survey focuses on forgetting in large language models (LLMs). We first provide background on LLMs, including their components, the types of LLMs, and their usual training pipeline. Second, we describe the motivations, types, and…

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques