Side Effects of Erasing Concepts from Diffusion Models

Shaswati Saha; Sourajit Saha; Manas Gaur; Tejas Gokhale

arXiv:2508.15124·cs.LG·September 23, 2025

TL;DR

This paper introduces the Side Effect Evaluation (SEE) benchmark to measure the robustness of concept erasure techniques (CETs) in diffusion models, and shows that CETs are easily circumvented and have side effects such as attribute leakage.

Contribution

The paper presents a comprehensive benchmark and evaluation pipeline for assessing the robustness of concept erasure techniques in diffusion models, highlighting their limitations and side effects.

Findings

01

CETs can be bypassed using hierarchical and compositional prompts.

02

CETs exhibit attribute leakage and attention dispersal.

03

Side effects include impact on neighboring concepts and target evasion.

Abstract

Concerns about text-to-image (T2I) generative models infringing on privacy, copyright, and safety have led to the development of concept erasure techniques (CETs). The goal of an effective CET is to prohibit the generation of undesired "target" concepts specified by the user, while preserving the ability to synthesize high-quality images of other concepts. In this work, we demonstrate that concept erasure has side effects and CETs can be easily circumvented. For a comprehensive measurement of the robustness of CETs, we present the Side Effect Evaluation (SEE) benchmark that consists of hierarchical and compositional prompts describing objects and their attributes. The dataset and an automated evaluation pipeline quantify side effects of CETs across three aspects: impact on neighboring concepts, evasion of targets, and attribute leakage. Our experiments reveal that CETs can be…
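As a rough illustration of what "hierarchical and compositional prompts" could look like, the sketch below builds probe prompts around an erased target from a toy concept hierarchy. It is not the SEE benchmark's data format or pipeline: the hierarchy, the attribute list, the prompt template, and the function names (`descendants`, `siblings`, `build_prompts`) are all assumptions made for this example.

```python
from itertools import product

# Hypothetical toy hierarchy (parent -> children); not taken from the SEE benchmark.
HIERARCHY = {
    "vehicle": ["car", "bus"],
    "car": ["sedan", "convertible"],
}
# Illustrative attributes only.
ATTRIBUTES = ["red", "small"]


def descendants(concept, hierarchy):
    """Return every concept below `concept`, depth-first."""
    found = []
    for child in hierarchy.get(concept, []):
        found.append(child)
        found.extend(descendants(child, hierarchy))
    return found


def siblings(concept, hierarchy):
    """Return concepts that share a parent with `concept`."""
    for children in hierarchy.values():
        if concept in children:
            return [c for c in children if c != concept]
    return []


def build_prompts(target, hierarchy, attributes):
    """Group attribute+object prompts by their relation to the erased target."""
    probes = {
        "target": [target],                               # the erased concept itself
        "sub_concepts": descendants(target, hierarchy),   # possible evasion routes
        "neighbors": siblings(target, hierarchy),         # concepts that should be unaffected
    }
    return {
        group: [f"a photo of a {attr} {obj}" for obj, attr in product(objs, attributes)]
        for group, objs in probes.items()
    }


prompts = build_prompts("car", HIERARCHY, ATTRIBUTES)
for group, items in prompts.items():
    print(group, items)
```

Under these assumptions, images generated from the `sub_concepts` prompts would indicate whether the target can still be reached through more specific wording, the `neighbors` prompts would indicate collateral damage to nearby concepts, and the attribute words would give a way to check whether attributes end up on the wrong object.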

Taxonomy

Topics: Misinformation and Its Impacts · Adversarial Robustness in Machine Learning · Topic Modeling