Isometric Representation Learning for Disentangled Latent Space of Diffusion Models
Jaehoon Hahm, Junho Lee, Sunghyun Kim, Joonseok Lee

TL;DR
This paper introduces Isometric Diffusion, a method that adds a geometric regularizer to diffusion models to learn a more disentangled and geometrically meaningful latent space, improving interpolation, inversion, and attribute control.
Contribution
It proposes a geometric regularizer for diffusion models that yields a more disentangled and geometrically sound latent space, an aspect of diffusion models that the authors describe as largely unexplored.
Findings
Enhanced image interpolation quality
More accurate image inversion results
Improved attribute control in latent space
Abstract
The latent space of diffusion models remains largely unexplored, despite their great success and potential in generative modeling. In fact, the latent space of existing diffusion models is entangled, with a distorted mapping from latent space to image space. To tackle this problem, we present Isometric Diffusion, which equips a diffusion model with a geometric regularizer that guides it to learn a geometrically sound latent space of the training data manifold. This approach allows diffusion models to learn a more disentangled latent space, which enables smoother interpolation, more accurate inversion, and more precise control over attributes directly in the latent space. Extensive experiments on image interpolation, image inversion, and linear editing show the effectiveness of our method.
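The abstract does not spell out the form of the regularizer. As a rough illustration of what "isometric" can mean for a latent-to-image mapping, the sketch below measures how far a map f is from a scaled isometry at a point z: it samples random unit directions v, estimates the squared stretch ||J(z) v||^2 with finite differences, and reports the relative variance of that stretch across directions. The value is near zero when every direction is stretched equally and grows as the map distorts some directions more than others. The function names, the finite-difference Jacobian, and this particular penalty are assumptions made for illustration, not the loss used in the paper.

```python
import math
import random


def jvp_fd(f, z, v, eps=1e-5):
    """Central finite-difference estimate of the Jacobian-vector product J(z) v."""
    plus = f([zi + eps * vi for zi, vi in zip(z, v)])
    minus = f([zi - eps * vi for zi, vi in zip(z, v)])
    return [(p - m) / (2 * eps) for p, m in zip(plus, minus)]


def random_unit_vector(n, rng):
    """Draw a direction uniformly from the unit sphere in R^n."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def isometry_penalty(f, z, num_dirs=256, seed=0):
    """Relative variance of ||J v||^2 over random unit directions v.

    Hypothetical illustrative penalty: approximately 0 when f stretches
    all sampled directions equally (a scaled isometry at z), positive otherwise.
    Invariant to an overall rescaling of f.
    """
    rng = random.Random(seed)
    stretches = []
    for _ in range(num_dirs):
        v = random_unit_vector(len(z), rng)
        jv = jvp_fd(f, z, v)
        stretches.append(sum(x * x for x in jv))
    mean = sum(stretches) / len(stretches)
    mean_sq = sum(s * s for s in stretches) / len(stretches)
    return mean_sq / (mean * mean) - 1.0


def scaled_rotation(z):
    """Toy 'decoder': rotate by 30 degrees and scale by 3 (a scaled isometry)."""
    c, s = math.cos(math.pi / 6), math.sin(math.pi / 6)
    return [3.0 * (c * z[0] - s * z[1]), 3.0 * (s * z[0] + c * z[1])]


def anisotropic_stretch(z):
    """Toy 'decoder': stretches the first axis five times more than the second."""
    return [5.0 * z[0], z[1]]


if __name__ == "__main__":
    z0 = [0.3, -0.7]
    print("scaled rotation:", isometry_penalty(scaled_rotation, z0))
    print("anisotropic stretch:", isometry_penalty(anisotropic_stretch, z0))
```

In a training setting, a term of this kind would be added to the usual diffusion objective with a weighting coefficient, and the Jacobian-vector products would come from automatic differentiation rather than finite differences. Both of those choices are also assumptions here.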
Taxonomy
Topics: Machine Learning in Healthcare · Speech Recognition and Synthesis · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion
