SHRED: Retain-Set-Free Unlearning via Self-Distillation with Logit Demotion
Zizhao Hu, Ameya Godbole, Johnny Tian-Zheng Wei, Mohammad Rostami, Jesse Thomason, Robin Jia

TL;DR
SHRED is a retain-set-free unlearning method for large language models that selectively demotes memorized tokens using self-distillation, achieving a better balance between forgetting specific content and maintaining overall utility.
Contribution
SHRED introduces a retain-set-free unlearning approach that uses token-level information and self-distillation to forget memorized content without the extra data dependency of a retain set.
Findings
SHRED outperforms retain-set-dependent methods on standard benchmarks.
It achieves a superior trade-off between forget efficacy and model utility.
SHRED is robust against relearning and membership-inference attacks.
Abstract
Machine unlearning for large language models (LLMs) aims to selectively remove memorized content such as private data, copyrighted text, or hazardous knowledge, without costly full retraining. Most existing methods require a retain set of curated examples to prevent catastrophic degradation of general model utility, creating an extra data dependency that complicates deployment. We propose SHRED (Self-distillation via High-surprisal-only Retain-set-free Entropy Demotion), a retain-set-free unlearning method built on a key insight: not all tokens within a forget set instance carry memorized information equally. High-information tokens concentrate the model's memorized knowledge, while low-information tokens reflect general language competence. SHRED operates in two stages. (1) Selection: We perform a forward pass on a forget set instance, collect per-token autoregressive probabilities,…
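The Selection stage begins with a forward pass that collects per-token autoregressive probabilities. A minimal sketch of that bookkeeping follows, using NumPy on a raw logit matrix rather than any particular model library. The function names (`per_token_probs`, `surprisal`, `rank_by_surprisal`) are hypothetical. The abstract is truncated before it states how the probabilities are used, so the ranking helper is only an assumption suggested by "High-surprisal-only" in the method name, not the authors' selection rule.

```python
import numpy as np

def per_token_probs(logits, token_ids):
    """Probability the model assigned to each observed next token.

    logits: array of shape (T, V), one row of vocabulary scores per position.
    token_ids: length-T sequence of observed token ids.
    Returns T-1 probabilities: entry t is p(token_ids[t+1] | token_ids[:t+1]).
    """
    logits = np.asarray(logits, dtype=np.float64)
    token_ids = np.asarray(token_ids)
    pred = logits[:-1]        # row t scores the token at position t+1
    targets = token_ids[1:]   # the first token has no prediction
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = pred - pred.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return np.exp(log_probs[np.arange(len(targets)), targets])

def surprisal(probs):
    """Surprisal in nats: -log p. Low probability means high surprisal."""
    return -np.log(probs)

def rank_by_surprisal(probs, k):
    """ASSUMPTION (not from the source): return the k token positions with
    the highest surprisal, as indices into token_ids."""
    order = np.argsort(-surprisal(probs), kind="stable")
    return order[:k] + 1      # +1 because probs[t] describes token t+1

# Toy example: 3 positions, vocabulary of 2 tokens.
toy_logits = np.array([[0.0, np.log(3.0)],
                       [np.log(3.0), 0.0],
                       [0.0, 0.0]])
toy_tokens = [0, 1, 1]
toy_probs = per_token_probs(toy_logits, toy_tokens)
```

In this toy input the second token receives probability 0.75 and the third 0.25, so the third token (index 2) has the higher surprisal.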
