The Illusion of Forgetting: Attack Unlearned Diffusion via Initial Latent Variable Optimization

Manyi Li; Yufan Liu; Lai Jiang; Bing Li; Yuming Li; Weiming Hu

arXiv:2602.00175 · cs.LG · May 8, 2026

TL;DR

This paper investigates why unlearned diffusion models only partially forget erased concepts and introduces IVO, an attack that tests current unlearning techniques by reviving dormant knowledge through optimization of the initial latent variable.

Contribution

The paper explains the cause of the forgetting illusion and proposes IVO, a novel attack framework that exposes flaws in existing unlearning methods for diffusion models.

Findings

01. IVO outperforms existing baselines in exposing unlearning flaws

02. Distributional discrepancy indicates retained knowledge and unlearning strength

03. Unlearning partially disrupts but does not erase internal knowledge

Abstract

Text-to-image diffusion models (DMs) are frequently abused to produce harmful or copyrighted content, violating public interests. Concept erasure (unlearning) is a promising paradigm to alleviate this issue. However, a peculiar forgetting illusion phenomenon exists whose cause is unclear. Based on empirical analysis, we formally explain this cause: most unlearning methods only partially disrupt the mapping between linguistic symbols and the underlying internal knowledge, leaving the knowledge intact as dormant memories. We further demonstrate that distributional discrepancy in the denoising process serves as a measurable indicator of how much of the mapping is retained, and that it also reflects unlearning strength. Inspired by this, we propose IVO (Initial Latent Variable Optimization), a novel attack framework designed to assess the robustness of current unlearning methods. IVO optimizes initial latent…
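The abstract is cut off before it states what IVO optimizes for, so the sketch below does not reproduce the paper's objective. It only illustrates the general idea the name points to: keep a sampler frozen and run gradient descent on its starting latent until the output approaches a chosen target. Everything in it is an assumption made for illustration, including the toy linear "sampler", the squared-error loss, and the hypothetical names `make_frozen_sampler` and `optimize_initial_latent`.

```python
import numpy as np


def make_frozen_sampler(dim=4, num_steps=10, seed=0):
    """Toy stand-in for a frozen denoising chain (assumption, not a real DM).

    The chain applies the same near-identity linear update num_steps times,
    so the whole chain collapses to a single fixed matrix.
    """
    rng = np.random.default_rng(seed)
    step = np.eye(dim) - 0.01 * rng.standard_normal((dim, dim))
    return np.linalg.matrix_power(step, num_steps)


def optimize_initial_latent(sampler, target, z_init, steps=300, lr=0.1):
    """Gradient descent on the initial latent only; the sampler is never updated.

    Loss (an illustrative choice): squared error between sampler output and target.
    Returns the optimized latent and the loss recorded before each update.
    """
    z = z_init.copy()  # leave the caller's starting latent untouched
    losses = []
    for _ in range(steps):
        residual = sampler @ z - target
        losses.append(float(residual @ residual))
        # Analytic gradient of ||sampler @ z - target||^2 with respect to z.
        z -= lr * 2.0 * sampler.T @ residual
    return z, losses


sampler = make_frozen_sampler()
rng = np.random.default_rng(1)
target = rng.standard_normal(4)  # hypothetical output we want the sampler to reach
z0 = rng.standard_normal(4)      # random starting latent
z_opt, losses = optimize_initial_latent(sampler, target, z0)
final_loss = float(np.sum((sampler @ z_opt - target) ** 2))
print(f"loss before: {losses[0]:.4f}, loss after: {final_loss:.2e}")
```

In a real attack setting the linear map would be replaced by the unlearned model's full denoising trajectory and the gradient would come from automatic differentiation; the point here is only that the model weights stay fixed while the starting noise is the optimization variable.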
