Unlearning the Unpromptable: Prompt-free Instance Unlearning in Diffusion Models
Kyungryeol Lee, Kyeonghyun Lee, Seongmin Hong, Byung Hyun Lee, Se Young Chun

TL;DR
This paper introduces a prompt-free unlearning method for diffusion models that forgets undesired outputs, such as an individual's face or culturally inaccurate images, without relying on text prompts, which supports privacy and ethical compliance.
Contribution
The paper presents a surrogate-based unlearning approach that combines image editing, timestep-aware weighting, and gradient surgery to remove unpromptable outputs from diffusion models.
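To make the "gradient surgery" ingredient concrete, here is a minimal sketch of one common form of it: projecting a forgetting gradient so that it no longer conflicts with a retention gradient (in the style of PCGrad). This is an illustration under assumptions, not the paper's formulation, which this summary does not spell out. The function name `project_conflicting` and the split into a "forget" and a "retain" gradient are hypothetical.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_conflicting(g_forget: List[float], g_retain: List[float]) -> List[float]:
    """Hypothetical helper: PCGrad-style projection (an assumed stand-in for
    the paper's gradient surgery).

    If the forgetting gradient points against the retention gradient
    (negative inner product), subtract its component along the retention
    gradient. Otherwise return it unchanged.
    """
    d = dot(g_forget, g_retain)
    if d >= 0:
        # No conflict: the forgetting step does not oppose retention.
        return list(g_forget)
    # d < 0 implies g_retain is nonzero, so the division is safe.
    scale = d / dot(g_retain, g_retain)
    return [gf - scale * gr for gf, gr in zip(g_forget, g_retain)]


if __name__ == "__main__":
    # Conflicting case: the second coordinate opposes the retain direction.
    print(project_conflicting([1.0, -1.0], [0.0, 1.0]))  # [1.0, 0.0]
    # Non-conflicting case: returned unchanged.
    print(project_conflicting([1.0, 1.0], [0.0, 1.0]))   # [1.0, 1.0]
```

After projection, the forgetting gradient is orthogonal to the retention gradient in the conflicting case, so a step along it does not, to first order, move against the retention objective. That is the general intuition behind using gradient surgery to forget selectively while preserving the rest.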
Findings
Successfully unlearns unpromptable outputs like faces and cultural inaccuracies
Preserves model integrity and performance after unlearning
Outperforms prompt-based and prompt-free baselines in experiments
Abstract
Machine unlearning aims to remove specific outputs from trained models, often at the concept level, such as forgetting all occurrences of a particular celebrity or filtering content via text prompts. However, many undesired outputs, such as an individual's face or culturally or factually inaccurate generations, often cannot be specified by text prompts. We address this underexplored setting of instance unlearning for outputs that are undesired but unpromptable, where the goal is to forget target outputs selectively while preserving the rest. To this end, we introduce an effective surrogate-based unlearning method that leverages image editing, timestep-aware weighting, and gradient surgery to guide trained diffusion models toward forgetting specific outputs. Experiments on conditional (Stable Diffusion 3) and unconditional (DDPM-CelebA) diffusion models demonstrate that our…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Hate Speech and Cyberbullying Detection
