CURE: Concept Unlearning via Orthogonal Representation Editing in Diffusion Models

Shristi Das Biswas; Arani Roy; Kaushik Roy

arXiv:2505.12677 · cs.CV · October 14, 2025

TL;DR

CURE is a training-free, efficient framework that unlearns undesired concepts in pre-trained diffusion models by orthogonally editing their weights, improving safety and specificity without retraining.

Contribution

The paper introduces CURE, a novel concept unlearning method that applies an orthogonal edit in weight space, computed in closed form, enabling rapid and precise removal of unwanted concepts in diffusion models.

Findings

01. Removes a concept quickly, in 2 seconds.

02. Effectively unlearns targeted concepts with minimal impact on unrelated capabilities.

03. Demonstrates robustness against red-teaming attacks and improved safety.

Abstract

As Text-to-Image models continue to evolve, so does the risk of generating unsafe, copyrighted, or privacy-violating content. Existing safety interventions - ranging from training data curation and model fine-tuning to inference-time filtering and guidance - often suffer from incomplete concept removal, susceptibility to jail-breaking, computational inefficiency, or collateral damage to unrelated capabilities. In this paper, we introduce CURE, a training-free concept unlearning framework that operates directly in the weight space of pre-trained diffusion models, enabling fast, interpretable, and highly specific suppression of undesired concepts. At the core of our method is the Spectral Eraser, a closed-form, orthogonal projection module that identifies discriminative subspaces using Singular Value Decomposition over token embeddings associated with the concepts to forget and retain.…
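The abstract is truncated here, so the exact form of the Spectral Eraser is not given. As a rough illustration of the general idea only (SVD over "forget" and "retain" embeddings, followed by a closed-form orthogonal projection applied to a weight matrix), a minimal NumPy sketch might look like the following. The function names, the tolerance `tol`, and the choice to right-multiply a weight matrix `W` by the projector are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def subspace_basis(embeddings, tol=1e-8):
    """Return an orthonormal basis (as columns) for the row space of `embeddings`."""
    # embeddings has shape (n_tokens, d); rows of vt are right singular vectors in R^d.
    _, s, vt = np.linalg.svd(embeddings, full_matrices=False)
    # Keep only directions whose singular value is numerically non-zero.
    rank = int(np.sum(s > tol * s[0])) if s.size and s[0] > 0 else 0
    return vt[:rank].T  # shape (d, rank)


def discriminative_projector(forget, retain, tol=1e-8):
    """Orthogonal projector that removes forget-only directions (illustrative)."""
    d = forget.shape[1]
    U_f = subspace_basis(forget, tol)
    U_r = subspace_basis(retain, tol)
    # Strip from each forget direction whatever it shares with the retain subspace.
    residual = U_f - U_r @ (U_r.T @ U_f)
    # Re-orthonormalise the residual directions, dropping any that vanished.
    u, s, _ = np.linalg.svd(residual, full_matrices=False)
    U_d = u[:, s > tol]
    # P = I - U_d U_d^T is symmetric and idempotent, and leaves the retain subspace fixed.
    return np.eye(d) - U_d @ U_d.T


def edit_weights(W, P):
    """Hypothetical closed-form edit: W has shape (out_dim, d) and acts on embeddings in R^d."""
    return W @ P
```

In this sketch a retain embedding passes through the edited weights unchanged, while a forget embedding is reduced to the component it shares with the retain subspace. Whether CURE constructs its projector this way, and which layers it edits, would need to be checked against the full paper.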

Videos

CURE: Concept Unlearning via Orthogonal Representation Editing in Diffusion Models · SlidesLive

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Face recognition and analysis

Methods: Diffusion