Soft Prompting for Unlearning in Large Language Models

Karuna Bhaila; Minh-Hao Van; Xintao Wu

arXiv:2406.12038 · cs.CL · August 7, 2024

TL;DR

This paper introduces SPUL, a lightweight soft prompting method enabling large language models to unlearn specific data subsets at inference time, balancing utility and forgetting without updating model weights.

Contribution

The paper proposes SPUL, a soft prompting framework for efficient unlearning in LLMs. It trains only a small set of prompt tokens and leaves the LLM weights frozen, offering a scalable, lightweight alternative to fine-tuning methods.

Findings

01. SPUL effectively unlearns specific data with minimal utility loss.

02. The method scales across multiple LLM architectures.

03. Hyperparameter choices influence unlearning effectiveness.

Abstract

The widespread popularity of Large Language Models (LLMs), partly due to their unique ability to perform in-context learning, has also brought to light the importance of ethical and safety considerations when deploying these pre-trained models. In this work, we focus on investigating machine unlearning for LLMs motivated by data protection regulations. In contrast to the growing literature on fine-tuning methods to achieve unlearning, we focus on a comparatively lightweight alternative called soft prompting to realize the unlearning of a subset of training data. With losses designed to enforce forgetting as well as utility preservation, our framework Soft Prompting for Unlearning (SPUL) learns prompt tokens that can be appended to an arbitrary query to induce unlearning of specific examples at inference time without updating LLM parameters. We conduct…
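
The mechanism the abstract describes (trainable prompt tokens appended to a query, a frozen model, and one loss for forgetting plus one for utility) can be illustrated on a toy problem. What follows is a minimal sketch under stated assumptions, not the authors' implementation: the mean-pooling classifier standing in for the LLM, the toy data, every name (`predict`, `prompt_grad`, `label_of`, `W`) and the choice of pushing forget examples toward a label other than the true one are all hypothetical, since this page does not give the paper's exact losses. The real method would operate on an LLM's input embeddings, typically with an autograd framework rather than hand-written gradients.

```python
import math

DIM, N_CLASSES, N_PROMPT, LR, STEPS = 2, 2, 2, 1.0, 300

# Frozen stand-in for the LLM (hypothetical): mean-pool the token embeddings,
# then apply a fixed linear head. W is never updated below.
W = [[1.0, 0.0], [0.0, 1.0]]

def predict(query, prompt):
    """Class probabilities for the query with the soft prompt tokens appended."""
    tokens = query + prompt
    pooled = [sum(t[d] for t in tokens) / len(tokens) for d in range(DIM)]
    logits = [sum(W[c][d] * pooled[d] for d in range(DIM)) for c in range(N_CLASSES)]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    return [e / sum(exps) for e in exps]

def prompt_grad(query, prompt, target):
    """Gradient of cross-entropy toward `target` w.r.t. one prompt token.
    With mean pooling, every prompt token receives the same gradient."""
    probs = predict(query, prompt)
    n = len(query) + len(prompt)
    err = [probs[c] - (1.0 if c == target else 0.0) for c in range(N_CLASSES)]
    return [sum(W[c][d] * err[c] for c in range(N_CLASSES)) / n for d in range(DIM)]

def label_of(query, prompt):
    """Index of the most probable class."""
    probs = predict(query, prompt)
    return probs.index(max(probs))

# Toy data (assumed): each item is (query token embeddings, true label).
forget = [([[0.0, 0.5]], 1)]
retain = [([[3.0, 0.0]], 0), ([[0.0, 6.0]], 1)]

prompt = [[0.0] * DIM for _ in range(N_PROMPT)]  # the only trainable values
before = [label_of(q, prompt) for q, _ in forget + retain]

for _ in range(STEPS):
    grad = [0.0] * DIM
    # Assumed forgetting loss: push forget examples toward a label other than the true one.
    for query, label in forget:
        grad = [g + x for g, x in zip(grad, prompt_grad(query, prompt, 1 - label))]
    # Utility loss: keep retain examples on their true labels.
    for query, label in retain:
        grad = [g + x for g, x in zip(grad, prompt_grad(query, prompt, label))]
    # Gradient step on the prompt tokens only; the model weights W stay fixed.
    prompt = [[p - LR * g for p, g in zip(tok, grad)] for tok in prompt]

after = [label_of(q, prompt) for q, _ in forget + retain]
print("before:", before, "after:", after)
```

In this toy setting the learned prompt changes the prediction on the forget example while the retain examples keep their labels, and `W` is untouched throughout. Dropping the prompt at inference restores the original behavior, which is the sense in which this style of unlearning happens at inference time rather than in the weights.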

Code & Models

Repositories

karuna-bhaila/llm_unlearning (PyTorch, official)

Videos

Soft Prompting for Unlearning in Large Language Models · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and Dialogue Systems

Methods: Focus