Z-Erase: Enabling Concept Erasure in Single-Stream Diffusion Transformers
Nanxiang Jiang, Zhaoxin Fan, Baisen Wang, Daiheng Gao, Junhang Cheng, Jifeng Guo, Yalan Qin, Yeying Jin, Hongwei Zheng, Faguo Wu, Wenjun Wu

TL;DR
Z-Erase is a concept erasure method for single-stream diffusion transformers in text-to-image generation; it avoids the generation collapse that results from directly applying prior erasure methods to these models.
Contribution
It introduces the first concept erasure technique tailored for single-stream diffusion transformers, including a framework and an adaptive algorithm with convergence guarantees.
Findings
Z-Erase overcomes generation collapse in single-stream models.
It achieves state-of-the-art performance in concept erasure tasks.
The method is validated through extensive experiments.
Abstract
Concept erasure serves as a vital safety mechanism for removing unwanted concepts from text-to-image (T2I) models. While extensively studied in U-Net and dual-stream architectures (e.g., Flux), this task remains under-explored in the recently emerging paradigm of single-stream diffusion transformers (e.g., Z-Image). In this new paradigm, text and image tokens are processed as a single unified sequence via shared parameters. Consequently, directly applying prior erasure methods typically leads to generation collapse. To bridge this gap, we introduce Z-Erase, the first concept erasure method tailored for single-stream T2I models. To guarantee stable image generation, Z-Erase first proposes a Stream Disentangled Concept Erasure Framework that decouples updates and enables existing methods to be applied to single-stream models. Subsequently, within this framework, we introduce Lagrangian-Guided Adaptive…
