Unsupervised Discovery of Semantic Latent Directions in Diffusion Models
Yong-Hyun Park, Mingi Kwon, Junghyo Jo, Youngjung Uh

TL;DR
This paper introduces an unsupervised method for finding interpretable semantic editing directions in the latent space of diffusion models, so that images can be edited through the latent variables rather than through conditions such as text prompts.
Contribution
The paper proposes a Riemannian-geometry-based method that relates the latent space of diffusion models to the intermediate feature maps of the U-Net and uses that relation to discover semantic directions and to characterize the geometry of the latent space.
Findings
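Discovered semantic directions mostly yield disentangled attribute changes and are globally consistent across samples
Editing at earlier timesteps changes coarse attributes, while editing at later timesteps changes high-frequency details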
Method is effective across various datasets and models
Abstract
Despite the success of diffusion models (DMs), we still lack a thorough understanding of their latent space. While image editing with GANs builds upon latent space, DMs rely on editing the conditions such as text prompts. We present an unsupervised method to discover interpretable editing directions for the latent variables of DMs. Our method adopts Riemannian geometry between the latent space and the intermediate feature maps of the U-Nets to provide a deep understanding of the geometrical structure of the latent space. The discovered semantic latent directions mostly yield disentangled attribute changes, and they are globally consistent across different samples. Furthermore, editing in earlier timesteps edits coarse attributes, while editing in later timesteps focuses on high-frequency details. We define the curvedness of a line segment between…
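The abstract excerpt above does not spell out how the directions are computed. As a minimal sketch of one way to read "Riemannian geometry between the latent space and the feature maps", the example below assumes a pullback-style construction: take the Jacobian of a map from a latent vector to a feature vector and use its top right singular vectors as candidate editing directions. Everything here is hypothetical and for illustration only: `toy_feature_map` stands in for a U-Net encoder, and `numerical_jacobian` and `latent_directions` are names invented for this sketch, not the authors' code.

```python
import numpy as np


def toy_feature_map(x, W):
    """Hypothetical stand-in for a U-Net encoder: latent x -> feature h."""
    return np.tanh(W @ x)


def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x, with shape (dim_h, dim_x)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((f(x + step) - f(x - step)) / (2 * eps))
    return np.stack(cols, axis=1)


def latent_directions(f, x, k=3):
    """Return the k latent directions that change the features the most.

    Assumption of this sketch: the right singular vectors of the Jacobian
    are unit vectors in latent space, ordered by how strongly a small step
    along them moves the feature vector.
    """
    J = numerical_jacobian(f, x)
    _, singular_values, vt = np.linalg.svd(J, full_matrices=False)
    return vt[:k], singular_values[:k]


# Toy setup: 8-dimensional latent, 5-dimensional feature.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 8))
x = rng.normal(size=8)
f = lambda z: toy_feature_map(z, W)

directions, strengths = latent_directions(f, x, k=3)

# A hypothetical "edit": move the latent along the strongest direction.
x_edited = x + 0.5 * directions[0]
```

In a real diffusion model the Jacobian would be far too large to form explicitly, so a practical version would need an iterative approximation based on Jacobian-vector products; the dense finite-difference version here is only meant to make the idea concrete.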
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neuroimaging Techniques and Applications
Methods: Diffusion
