Continual Unlearning for Text-to-Image Diffusion Models: A Regularization Perspective

Justin Lee; Zheda Mai; Jinsu Yoo; Chongyu Fan; Cheng Zhang; Wei-Lun Chao

arXiv:2511.07970·cs.LG·March 4, 2026

Open Access 3 Reviews

TL;DR

This paper investigates the challenge of continual unlearning in text-to-image diffusion models, highlighting issues of utility collapse due to parameter drift and proposing regularization techniques to improve model retention and concept preservation.

Contribution

It is the first systematic study of continual unlearning in diffusion models, introducing regularization methods, including a gradient-projection approach, to mitigate parameter drift and improve unlearning performance.
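To make the gradient-projection idea concrete, here is a minimal sketch of one common form of it: the unlearning gradient is stripped of its components along the gradients of concepts that should be retained. This summary does not give the paper's exact formulation, so the function names, the use of Gram-Schmidt, and the toy vectors below are all assumptions for illustration.

```python
# Illustrative sketch only, not the authors' implementation.
# All names (dot, orthonormal_basis, project_out) are hypothetical.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(vectors, eps=1e-12):
    """Gram-Schmidt: build an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > eps:  # skip vectors that are (numerically) linearly dependent
            basis.append([wi / norm for wi in w])
    return basis

def project_out(grad, retain_grads):
    """Remove from `grad` its components inside the span of `retain_grads`."""
    g = list(grad)
    for b in orthonormal_basis(retain_grads):
        c = dot(g, b)
        g = [gi - c * bi for gi, bi in zip(g, b)]
    return g

# Toy example: flattened gradients in a 3-parameter model (assumed values).
unlearn_grad = [1.0, 2.0, 3.0]
retain_grads = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
projected = project_out(unlearn_grad, retain_grads)
```

In this toy case the projected update is orthogonal to both retained-concept gradients, so to first order a step along it leaves those concepts' losses unchanged.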

Findings

01

Regularization reduces utility collapse during continual unlearning.

02

Semantic-aware regularizers better preserve concepts near unlearning targets.

03

Gradient projection significantly enhances unlearning effectiveness.

Abstract

Machine unlearning, the ability to remove designated concepts from a pre-trained model, has advanced rapidly, particularly for text-to-image diffusion models. However, existing methods typically assume that unlearning requests arrive all at once, whereas in practice they often arrive sequentially. We present the first systematic study of continual unlearning in text-to-image diffusion models and show that popular unlearning methods suffer from rapid utility collapse: after only a few requests, models forget retained knowledge and generate degraded images. We trace this failure to cumulative parameter drift from the pre-training weights and argue that regularization is crucial to addressing it. To this end, we study a suite of add-on regularizers that (1) mitigate drift and (2) remain compatible with existing unlearning methods. Beyond generic regularizers, we show that semantic…
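As a rough illustration of what an add-on drift regularizer can look like, the sketch below adds an L2 penalty on the distance from the pre-trained weights to a plain gradient step. The abstract does not specify the paper's update rule, so the function names, the learning rate, the penalty strength, and the toy values are all assumptions.

```python
# Illustrative sketch only, not the authors' implementation.
# Penalty: strength * ||theta - theta_0||^2, with theta_0 the pre-trained weights.

def l2_drift_penalty(params, pretrained, strength):
    """Squared L2 distance from the pre-trained weights, scaled by `strength`."""
    return strength * sum((p - q) ** 2 for p, q in zip(params, pretrained))

def regularized_step(params, pretrained, unlearn_grad, lr, strength):
    """One gradient step on (unlearning loss + L2 drift penalty).

    The penalty's gradient is 2 * strength * (theta - theta_0), which pulls
    each parameter back toward its pre-trained value.
    """
    return [
        p - lr * (g + 2.0 * strength * (p - q))
        for p, q, g in zip(params, pretrained, unlearn_grad)
    ]

# Toy values (assumed): two parameters that have already drifted from zero.
pretrained = [0.0, 0.0]
params = [1.0, -1.0]
unlearn_grad = [0.5, 0.5]

penalty = l2_drift_penalty(params, pretrained, strength=1.0)
new_params = regularized_step(params, pretrained, unlearn_grad, lr=0.1, strength=1.0)
```

Because the penalty depends only on the current and pre-trained weights, it can be added to an existing unlearning objective without changing that objective, which is the sense in which such a regularizer is an add-on.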

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 3

Strengths

- Formally defines and analyzes continual unlearning in the text-to-image setting.
- Provides both theoretical and empirical insights into performance collapse due to parameter drift.
- Proposes modular regularization and semantic-aware techniques that can easily integrate with existing unlearning methods.
- The gradient projection method effectively improves in-domain retention and reduces collateral forgetting.

Weaknesses

- The study is limited to style and object deletions; it does not evaluate more practically relevant concepts such as NSFW, copyrighted, or identity-based content, as previous works do.
- All experiments are conducted on a single diffusion model within the UnlearnCanvas benchmark. The paper does not assess whether the proposed regularizers and gradient projection method generalize to other architectures or larger-scale diffusion models.
- The benchmark setup relies on a limited base model and a relat

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. Clear problem definition of continual unlearning with precise requirements for erasing targets, preserving prior removals, and retaining unrelated capabilities, plus explicit metrics for unlearning accuracy and retention accuracy split into in-domain and cross-domain.
2. Practical plug-and-play remedies that integrate with existing unlearning methods, including L1 or L2 update penalties, selective fine-tuning, and model merging via TIES, which reduce drift and improve retention.
3. The gr

Weaknesses

1. Sensitivity to choices such as the strength of L1 or L2 penalties, the top k percent for selective updates, and the number and selection rule for auxiliary concepts in gradient projection is not fully characterized.
2. Limited cost analysis for independent unlearning plus merging and for importance computation in selective tuning.
3. The paper does not discuss several closely related recent works that address multi-concept and efficient forgetting, such as Sculpting Memory: Multi-Concept Fo

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. This paper presents an interesting and valuable study on continual unlearning in text-to-image diffusion models.
2. This paper is very well presented.
3. The paper conducts a detailed analysis of the challenges faced by continual unlearning in text-to-image diffusion models through a series of experiments.

Weaknesses

The authors emphasize that this paper does not propose new algorithms but focuses on the analysis of continual unlearning. I have the following questions about this paper:
1. Compared to regular continual learning, what are the additional challenges of continual unlearning? Parameter drift and conceptual confusion have been extensively studied in continual learning. Results in Figure 3 separate the unlearning target from the retention target, but can also be interpreted as follows: as the number

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Model Reduction and Neural Networks