Opt-Out: Investigating Entity-Level Unlearning for Large Language Models via Optimal Transport
Minseok Choi, Daniel Rim, Dohyun Lee, Jaegul Choo

TL;DR
This paper introduces Opt-Out, a novel optimal transport-based method for entity-level unlearning in large language models, enabling effective removal of specific entity knowledge while maintaining overall model performance.
Contribution
The paper proposes a new unlearning technique using Wasserstein distance and introduces the first dataset for evaluating entity-level unlearning in LLMs.
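As a rough illustration of the distance this contribution relies on, the sketch below computes the one-dimensional Wasserstein-1 distance between two equally sized samples, where the optimal transport plan simply pairs sorted values. This summary does not say how Opt-Out computes or applies the distance, so this is not the authors' implementation. The function name `wasserstein_1d` and the toy "initial" and "updated" parameter values are assumptions made for illustration only.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equally sized 1-D empirical samples.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized samples")
    if not xs:
        raise ValueError("samples must be non-empty")
    # In one dimension with uniform weights, the optimal transport plan
    # matches the i-th smallest value of xs to the i-th smallest of ys.
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)


# Toy stand-ins for a handful of parameter values before and after an update.
initial = [0.0, 1.0, 2.0]
updated = [2.5, 0.5, 1.5]
print(wasserstein_1d(initial, updated))  # 0.5
```

Because the distance compares the two sets of values as distributions, reordering the same values leaves it at zero; only a shift in the overall spread of values increases it.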
Findings
Opt-Out outperforms existing unlearning methods.
It achieves more precise and effective removal of entity knowledge.
The method reduces the need for full model retraining.
Abstract
Instruction-following large language models (LLMs), such as ChatGPT, have become widely popular among everyday users. However, these models inadvertently disclose private, sensitive information to their users, underscoring the need for machine unlearning techniques to remove selective information from the models. While prior work has focused on forgetting small, random subsets of training data at the instance-level, we argue that real-world scenarios often require the removal of an entire user's data, which may require a more careful maneuver. In this study, we explore entity-level unlearning, which aims to erase all knowledge related to a target entity while preserving the remaining model capabilities. To address this, we introduce Opt-Out, an optimal transport-based unlearning method that utilizes the Wasserstein distance from the model's initial parameters to achieve more effective and…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
