UCD: Unlearning in LLMs via Contrastive Decoding

Vinith M. Suriyakumar; Ayush Sekhari; Ashia Wilson

arXiv:2506.12097·cs.CL·June 17, 2025

Open Access

TL;DR

This paper introduces an inference-time unlearning method for large language models. It applies contrastive decoding with two smaller auxiliary models, one trained with the forget set and one without, to suppress specific information while preserving overall performance.

Contribution

A contrastive decoding approach that performs unlearning at inference time and improves the tradeoff between unlearning effectiveness and model utility.

Findings

01. Notable gains in both forget quality and retained performance compared with prior approaches

02. Removal of targeted forget-set information from LLMs at inference time

03. Outperforms prior unlearning methods on the TOFU and MUSE benchmarks

Abstract

Machine unlearning aims to remove specific information, e.g. sensitive or undesirable content, from large language models (LLMs) while preserving overall performance. We propose an inference-time unlearning algorithm that uses contrastive decoding, leveraging two auxiliary smaller models, one trained without the forget set and one trained with it, to guide the outputs of the original model using their difference during inference. Our strategy substantially improves the tradeoff between unlearning effectiveness and model utility. We evaluate our approach on two unlearning benchmarks, TOFU and MUSE. Results show notable gains in both forget quality and retained performance in comparison to prior approaches, suggesting that incorporating contrastive decoding can offer an efficient, practical avenue for unlearning concepts in large-scale models.
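
The decoding step described in the abstract can be illustrated with a minimal sketch. This is one plausible reading of the abstract and not the authors' implementation: the additive combination in logit space, the weight `alpha`, the function name `contrastive_unlearn_logits`, and the toy three-token vocabulary are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_unlearn_logits(base_logits, aux_without_forget,
                               aux_with_forget, alpha=1.0):
    """Shift the original model's next-token logits by the difference
    between the two auxiliary models.

    Assumption (hypothetical rule, not taken from the paper's text):
        adjusted = base + alpha * (aux_without_forget - aux_with_forget)
    Tokens that the forget-set-trained auxiliary model favours more than
    the clean auxiliary model are pushed down.
    """
    if not (len(base_logits) == len(aux_without_forget) == len(aux_with_forget)):
        raise ValueError("all three models must share one vocabulary")
    return [
        b + alpha * (clean - forget)
        for b, clean, forget in zip(base_logits, aux_without_forget, aux_with_forget)
    ]


# Toy example: token 0 stands in for content from the forget set.
base = [2.0, 0.5, 0.1]          # original (large) model
aux_with = [1.5, 0.2, 0.0]      # small model trained with the forget set
aux_without = [-0.5, 0.3, 0.1]  # small model trained without it

adjusted = contrastive_unlearn_logits(base, aux_without, aux_with, alpha=1.0)
probs_before = softmax(base)
probs_after = softmax(adjusted)
```

In this toy setting the probability of token 0 falls after the adjustment, while the relative ordering of the other two tokens is unchanged. Only forward passes are needed, which is what makes the approach an inference-time method; how the auxiliary models are trained and how the weight is chosen are left to the paper.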

Taxonomy

Topics: Natural Language Processing Techniques · Mathematics, Computing, and Information Processing

Methods: Sparse Evolutionary Training