Disentangled Sparse Representations for Concept-Separated Diffusion Unlearning

Hyeonjin Kim; Hangyeol Jung; Heechan Yun; Sungjun Yun; and Dong-Jun Han

arXiv:2605.12122·cs.LG·May 13, 2026

TL;DR

This paper introduces SAEParate, a sparse autoencoder (SAE)-based method that organizes latent representations into concept-specific clusters, enabling more precise concept unlearning in text-to-image diffusion models.

Contribution

SAEParate combines a concept-aware contrastive objective with a GeLU-based nonlinear encoder to separate concepts explicitly in the SAE latent space, which improves unlearning performance.
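
To make the idea of a concept-aware contrastive objective concrete, here is a minimal NumPy sketch. It uses a generic supervised-contrastive (SupCon-style) loss as a stand-in, since this page does not give the paper's exact formulation. The function name `concept_contrastive_loss`, the cosine similarity, and the `temperature` parameter are all assumptions made for illustration.

```python
import numpy as np

def concept_contrastive_loss(latents, concept_ids, temperature=0.1, eps=1e-12):
    """SupCon-style loss over SAE latents (illustrative stand-in, not the paper's loss).

    latents:     (n, d) array of latent codes, n >= 2
    concept_ids: (n,) integer concept label per sample
    Latents that share a concept label are pulled together; others are pushed apart.
    """
    latents = np.asarray(latents, dtype=float)
    concept_ids = np.asarray(concept_ids)
    # Normalize rows so the dot product is a cosine similarity.
    z = latents / (np.linalg.norm(latents, axis=1, keepdims=True) + eps)
    sim = z @ z.T / temperature
    n = z.shape[0]
    not_self = ~np.eye(n, dtype=bool)
    # Log-sum-exp over every other sample (self-similarity excluded).
    masked = np.where(not_self, sim, -np.inf)
    row_max = masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(masked - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom
    # Positives: other samples carrying the same concept label.
    positives = (concept_ids[:, None] == concept_ids[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        return 0.0  # no sample has a positive partner in this batch
    mean_log_prob = (log_prob * positives).sum(axis=1)[has_pos] / pos_count[has_pos]
    return float(-mean_log_prob.mean())
```

Under these assumptions, a batch whose latents already cluster by concept yields a lower loss than the same latents with shuffled labels. That is the pressure toward concept-specific clusters described above.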

Findings

01 State-of-the-art unlearning performance on UnlearnCanvas

02 Strong gains in joint style-object unlearning

03 Reduced interference between target and non-target concepts

Abstract

Unlearning specific concepts in text-to-image diffusion models has become increasingly important for preventing undesirable content generation. Among prior approaches, sparse autoencoder (SAE)-based methods have attracted attention due to their ability to suppress target concepts through lightweight manipulation of latent features, without modifying model parameters. However, SAEs trained with sparse reconstruction objectives do not explicitly enforce concept-wise separation, resulting in shared latent features across concepts. To address this, we propose SAEParate, which organizes latent representations into concept-specific clusters via a concept-aware contrastive objective, enabling more precise concept suppression while reducing unintended interference during unlearning. In addition, we enhance the encoder with a GeLU-based nonlinear transformation to increase its expressive…
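
The mechanism the abstract describes, suppressing a concept by editing SAE latent features while leaving model weights untouched, can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the class name `TinySAE`, the layer sizes, the random weights, the ReLU sparsity step, and the choice to subtract the decoded contribution of the target features are all hypothetical. Only the use of a GeLU nonlinearity in the encoder is taken from the abstract.

```python
import numpy as np

def gelu(x):
    # Standard tanh approximation of GeLU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

class TinySAE:
    """Toy sparse autoencoder with a GeLU hidden layer in the encoder (hypothetical layout)."""

    def __init__(self, d_model, d_hidden, d_latent, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (d_model, d_hidden))
        self.b1 = np.zeros(d_hidden)
        self.w2 = rng.normal(0.0, 0.1, (d_hidden, d_latent))
        self.b2 = np.zeros(d_latent)
        self.dec = rng.normal(0.0, 0.1, (d_latent, d_model))

    def encode(self, x):
        h = gelu(x @ self.w1 + self.b1)               # nonlinear transformation
        return np.maximum(h @ self.w2 + self.b2, 0.0)  # non-negative latent code

    def suppress(self, x, target_features):
        """Remove the decoded contribution of the chosen latent features from x."""
        z = self.encode(x)
        idx = np.asarray(target_features, dtype=int)
        mask = np.zeros(z.shape[-1])
        mask[idx] = 1.0
        # Only the activation x is edited; no weights are changed.
        return x - (z * mask) @ self.dec

sae = TinySAE(d_model=8, d_hidden=16, d_latent=32)
activations = np.ones((2, 8))                        # stand-in for diffusion-model activations
edited = sae.suppress(activations, target_features=[0, 3])
```

If features are shared across concepts, zeroing them also perturbs non-target concepts. That is the interference that concept-wise separation of the latent space is meant to reduce.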
