Unsupervised Semantic Correspondence Using Stable Diffusion

Eric Hedlin; Gopal Sharma; Shweta Mahajan; Hossam Isack; Abhishek Kar; Andrea Tagliasacchi; Kwang Moo Yi

arXiv:2305.15581·cs.CV·December 29, 2023·22 cites

TL;DR

This paper demonstrates that pre-trained text-to-image diffusion models can be used in an unsupervised manner to find semantic correspondences across images by optimizing prompt embeddings, achieving competitive results without additional training.

Contribution

The authors introduce a novel unsupervised method leveraging diffusion models' semantic understanding to find image correspondences without training.

Findings

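01 Achieves results on par with the strongly supervised state of the art on the PF-Willow dataset.

02 Outperforms existing weakly supervised and unsupervised methods on multiple datasets, including a 20.9% relative improvement on SPair-71k.

03 Optimizes prompt embeddings so that the model's attention concentrates on a chosen region, then reuses those embeddings to locate the matching region in another image.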

Abstract

Text-to-image diffusion models are now capable of generating images that are often indistinguishable from real images. To generate such images, these models must understand the semantics of the objects they are asked to generate. In this work we show that, without any training, one can leverage this semantic knowledge within diffusion models to find semantic correspondences - locations in multiple images that have the same semantic meaning. Specifically, given an image, we optimize the prompt embeddings of these models for maximum attention on the regions of interest. These optimized embeddings capture semantic information about the location, which can then be transferred to another image. By doing so we obtain results on par with the strongly supervised state of the art on the PF-Willow dataset and significantly outperform (20.9% relative for the SPair-71k dataset) any existing weakly…
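The optimize-then-transfer idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it replaces Stable Diffusion's attention maps with a plain softmax over dot products between per-pixel feature vectors and a single embedding, and it uses synthetic features instead of real images. All names here (attention, optimize_embedding, transfer, make_toy_pair) are hypothetical and chosen only for illustration.

```python
import numpy as np


def attention(features, embedding):
    """Softmax over pixel locations of feature/embedding dot products."""
    scores = features @ embedding
    scores = scores - scores.max()  # subtract max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()


def optimize_embedding(features, query_idx, steps=300, lr=0.5):
    """Gradient ascent on the log-attention at query_idx.

    Toy stand-in for optimizing a prompt embedding so that attention
    peaks on the region of interest in the source image.
    """
    embedding = np.zeros(features.shape[1])
    for _ in range(steps):
        attn = attention(features, embedding)
        # d/de log softmax(F e)[q] = F[q] - sum_i attn[i] * F[i]
        grad = features[query_idx] - attn @ features
        embedding += lr * grad
    return embedding


def transfer(embedding, target_features):
    """Return the target location with the highest attention."""
    return int(np.argmax(attention(target_features, embedding)))


def make_toy_pair(num_pixels=64, dim=16, noise=0.01, seed=0):
    """Synthetic 'images': the target is a shuffled, slightly noisy source."""
    rng = np.random.default_rng(seed)
    source = rng.normal(size=(num_pixels, dim))
    source /= np.linalg.norm(source, axis=1, keepdims=True)
    perm = rng.permutation(num_pixels)
    target = np.empty_like(source)
    # Source pixel i reappears at target location perm[i].
    target[perm] = source + noise * rng.normal(size=source.shape)
    return source, target, perm


source, target, perm = make_toy_pair()
query = 5  # index of the source location we want to match
emb = optimize_embedding(source, query)
print("predicted:", transfer(emb, target), "ground truth:", int(perm[query]))
```

In this toy setting the embedding is optimized once on the source features and then reused unchanged on the target features, which mirrors the transfer step described in the abstract; the real method operates on a diffusion model's internal attention rather than on fixed feature vectors.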

Code & Models

Repositories

ubc-vision/LDM_correspondences (PyTorch, official)

Videos

Unsupervised Semantic Correspondence Using Stable Diffusion (SlidesLive)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Diffusion