Which Way from B to A: The role of embedding geometry in image interpolation for Stable Diffusion
Nicholas Karris, Luke Durell, Javier Flores, Tegan Emerson

TL;DR
This paper introduces a novel approach to image interpolation in Stable Diffusion by interpreting CLIP embeddings as point clouds in Wasserstein space and using optimal transport to produce smoother, more coherent intermediate images.
Contribution
The paper proposes a new geometric perspective on CLIP embeddings, treating them as point clouds rather than matrices, and applies optimal transport to improve image interpolation quality in Stable Diffusion.
Findings
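Optimal transport-based interpolation yields smoother, more coherent intermediate images.
Viewing CLIP embeddings as point clouds in a Wasserstein space, rather than as matrices in a Euclidean space, offers a new way to understand the geometry of embedding space.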
The method outperforms standard interpolation techniques.
Abstract
It can be shown that Stable Diffusion has a permutation-invariance property with respect to the rows of Contrastive Language-Image Pretraining (CLIP) embedding matrices. This inspired the novel observation that these embeddings can naturally be interpreted as point clouds in a Wasserstein space rather than as matrices in a Euclidean space. This perspective opens up new possibilities for understanding the geometry of embedding space. For example, when interpolating between embeddings of two distinct prompts, we propose reframing the interpolation problem as an optimal transport problem. By solving this optimal transport problem, we compute a shortest path (or geodesic) between embeddings that captures a more natural and geometrically smooth transition through the embedding space. This results in smoother and more coherent intermediate (interpolated) images when rendered by the Stable…
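As a rough illustration of the idea in the abstract, the sketch below treats two embedding matrices as unordered point clouds with uniform mass and interpolates along the optimal matching between their rows. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `ot_interpolate` and `linear_interpolate` are hypothetical, the squared Euclidean cost and the Hungarian solver from SciPy are choices made here for simplicity, and the toy arrays stand in for real CLIP embeddings. For two clouds with the same number of equally weighted points, an optimal transport plan can be taken to be a one-to-one matching, so an assignment solver suffices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def linear_interpolate(X, Y, t):
    """Row-by-row (matrix) interpolation: row i of X is paired with row i of Y."""
    return (1.0 - t) * X + t * Y


def ot_interpolate(X, Y, t):
    """Displacement interpolation between two equal-size point clouds.

    X, Y: arrays of shape (n, d); rows are treated as unordered points
          carrying uniform mass (an assumption of this sketch).
    t:    interpolation parameter in [0, 1].
    """
    # Pairwise squared Euclidean distances between rows of X and rows of Y.
    cost = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    # Optimal one-to-one matching; rows is 0..n-1, cols is the matched row of Y.
    rows, cols = linear_sum_assignment(cost)
    # Move each point of X along a straight line to its matched point of Y.
    return (1.0 - t) * X[rows] + t * Y[cols]


# Toy data: Y is X with its rows shuffled, so both are the same point cloud.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Y = X[rng.permutation(5)]

midpoint_ot = ot_interpolate(X, Y, 0.5)
midpoint_linear = linear_interpolate(X, Y, 0.5)
```

In this toy case the two inputs differ only by a row permutation, so the optimal-transport midpoint is the original cloud, while the row-by-row midpoint averages mismatched rows and drifts away from both inputs. That is the kind of detour the point-cloud view is meant to avoid when the model itself is insensitive to row order.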
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Medical Image Segmentation Techniques · Advanced Image Processing Techniques
