FADE: Adversarial Concept Erasure in Flow Models

Zixuan Fu; Yan Ren; Finn Carter; Chenyue Wang; Ze Niu; Dacheng Yu; Emily Davis; Bo Zhang

arXiv:2507.12283·cs.CV·July 17, 2025

TL;DR

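FADE removes specified concepts from text-to-image diffusion models through trajectory-aware adversarial fine-tuning. It aims to improve privacy and fairness while preserving image quality, and is supported by a theoretical guarantee on the mutual information between the erased concept and the model's outputs.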

Contribution

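We introduce FADE, a concept erasure technique for diffusion models that combines trajectory-aware fine-tuning with an adversarial objective. The method comes with a formal guarantee that it minimizes the mutual information between the erased concept and the model's outputs, and it achieves stronger concept removal than baseline methods.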

Findings

01. FADE outperforms baselines in concept removal efficacy.

02. FADE maintains higher image fidelity after concept erasure.

03. Ablation studies confirm the importance of each component.

Abstract

Diffusion models have demonstrated remarkable image generation capabilities, but also pose risks in privacy and fairness by memorizing sensitive concepts or perpetuating biases. We propose a novel concept erasure method for text-to-image diffusion models, designed to remove specified concepts (e.g., a private individual or a harmful stereotype) from the model's generative repertoire. Our method, termed FADE (Fair Adversarial Diffusion Erasure), combines a trajectory-aware fine-tuning strategy with an adversarial objective to ensure the concept is reliably removed while preserving overall model fidelity. Theoretically, we prove a formal guarantee that our approach minimizes the mutual information between the erased concept and the model's outputs, ensuring privacy and fairness. Empirically, we evaluate FADE on Stable Diffusion and FLUX, using benchmarks from prior work…
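To make the link between an adversarial objective and mutual information concrete, here is a minimal toy sketch. It is not the paper's algorithm: the function names (`binary_cross_entropy`, `mi_lower_bound`, `eraser_loss`), the weight `lam`, and the binary concept label are all assumptions made for illustration, and the trajectory-aware part of FADE is not modeled. The sketch relies only on the standard variational bound I(C; X) >= H(C) - CE(q), where q is any adversary that predicts the concept C from the output X.

```python
import math

def binary_cross_entropy(probs, labels):
    """Mean cross-entropy (in nats) of adversary predictions against 0/1 labels."""
    eps = 1e-12
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def mi_lower_bound(probs, labels):
    """Empirical variational bound: I(C; X) >= H(C) - CE(adversary)."""
    prior = sum(labels) / len(labels)
    if prior in (0.0, 1.0):
        entropy = 0.0
    else:
        entropy = -(prior * math.log(prior) + (1 - prior) * math.log(1 - prior))
    return entropy - binary_cross_entropy(probs, labels)

def eraser_loss(preserve_loss, adv_probs, labels, lam=1.0):
    """Hypothetical combined objective for the fine-tuned model.

    The model is rewarded (lower loss) when the adversary's cross-entropy
    is high, i.e. when the concept is hard to detect in its outputs,
    while preserve_loss stands in for a fidelity term on unrelated prompts.
    """
    return preserve_loss - lam * binary_cross_entropy(adv_probs, labels)

# An adversary at chance level certifies no detectable concept information,
# while a confident, correct adversary certifies a large lower bound.
labels = [1, 0, 1, 0]
chance_bound = mi_lower_bound([0.5, 0.5, 0.5, 0.5], labels)
leaky_bound = mi_lower_bound([0.99, 0.01, 0.99, 0.01], labels)
```

In this toy setting, driving the adversary toward chance pushes the bound toward zero, which is the intuition behind pairing an adversarial term with a fidelity term; a real system would train the adversary and the generative model in alternation.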

Taxonomy

Topics: Adversarial Robustness in Machine Learning