Gromov Wasserstein Optimal Transport for Semantic Correspondences

Francis Snelgar; Stephen Gould; Ming Xu; Liang Zheng; Akshay Asthana

arXiv:2602.03105 · cs.CV · February 4, 2026

Open Access

TL;DR

This paper introduces a Gromov Wasserstein optimal transport method for semantic correspondence between image pairs. It is more efficient than existing ensemble-based approaches that combine features from large foundation models, while remaining competitive in accuracy.
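For reference, the standard discrete Gromov Wasserstein objective with squared loss is written out below. This is the textbook form, and the paper's exact formulation may differ. The symbols are assumptions introduced here: $C^{s}$ and $C^{t}$ are pairwise distance matrices within the source and target images, $p$ and $q$ are probability weights over their $n$ and $m$ feature locations, and $T$ is the transport plan.

```latex
\[
\mathrm{GW}(C^{s}, C^{t}, p, q)
  = \min_{T \in \Pi(p, q)}
    \sum_{i,k=1}^{n} \sum_{j,l=1}^{m}
    \bigl( C^{s}_{ik} - C^{t}_{jl} \bigr)^{2} \, T_{ij} \, T_{kl},
\qquad
\Pi(p, q) = \bigl\{ T \in \mathbb{R}_{\ge 0}^{n \times m} :
    T \mathbf{1}_{m} = p,\; T^{\top} \mathbf{1}_{n} = q \bigr\}.
\]
```

Each term penalises a pair of matches $(i \to j)$ and $(k \to l)$ when the distance between $i$ and $k$ in one image disagrees with the distance between $j$ and $l$ in the other, so the objective favours plans that preserve within-image structure.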

Contribution

The authors replace standard nearest-neighbor matching with a Gromov Wasserstein optimal transport algorithm, which improves accuracy over the DINOv2 baseline and is more efficient than ensemble-based methods.
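The difference between the two matching strategies can be sketched in NumPy. This page does not give the paper's cost matrices, solver, or hyperparameters, so everything below is an assumption: the function names are hypothetical, the features are random stand-ins for DINOv2 patch features, and the solver is the generic entropic Gromov Wasserstein scheme (repeated Sinkhorn projections on a linearised cost), not necessarily the authors' algorithm.

```python
import numpy as np

def cosine_similarity(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def nearest_neighbour_match(src, tgt):
    """Baseline: each source feature picks its most similar target feature."""
    return cosine_similarity(src, tgt).argmax(axis=1)

def intra_distances(feats):
    """Within-image cosine distances, scaled to [0, 1] (an assumed choice)."""
    dist = 1.0 - cosine_similarity(feats, feats)
    return dist / dist.max()

def sinkhorn(cost, p, q, eps, n_iters=200):
    """Entropic OT plan for a fixed cost matrix via Sinkhorn scaling."""
    kernel = np.exp(-(cost - cost.min()) / eps)
    u = np.ones_like(p)
    for _ in range(n_iters):
        v = q / (kernel.T @ u)
        u = p / (kernel @ v)
    return u[:, None] * kernel * v[None, :]

def entropic_gw(C1, C2, p, q, eps=0.05, n_outer=50):
    """Entropic Gromov Wasserstein with squared loss (generic sketch).

    Each outer step linearises the quadratic objective around the current
    plan and solves the resulting entropic OT problem with Sinkhorn.
    """
    plan = np.outer(p, q)  # independent coupling as the starting point
    const = (C1**2 @ p)[:, None] + (C2**2 @ q)[None, :]
    for _ in range(n_outer):
        linearised_cost = const - 2.0 * C1 @ plan @ C2.T
        plan = sinkhorn(linearised_cost, p, q, eps)
    return plan

# Toy data: the target features are a shuffled copy of the source features.
rng = np.random.default_rng(0)
src = rng.normal(size=(6, 8))
perm = rng.permutation(6)
tgt = src[perm]

nn_match = nearest_neighbour_match(src, tgt)

p = np.full(6, 1.0 / 6)  # uniform weights over source locations
q = np.full(6, 1.0 / 6)  # uniform weights over target locations
T = entropic_gw(intra_distances(src), intra_distances(tgt), p, q)
gw_match = T.argmax(axis=1)  # hard correspondences read off the soft plan
```

Nearest-neighbour matching decides each correspondence independently, whereas the transport plan `T` is optimised jointly under marginal constraints. In this sketch the Gromov Wasserstein term sees only within-image distances; the objective is non-convex, so the plan it returns is a local solution and is not guaranteed to recover the true permutation.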

Findings

01. Boosts DINOv2 baseline performance

02. Competitive with state-of-the-art methods using SD features

03. Achieves 5-10x efficiency improvement

Abstract

Establishing correspondences between image pairs is a long-studied problem in computer vision. With recent large-scale foundation models showing strong zero-shot performance on downstream tasks, including classification and segmentation, there has been interest in using the internal feature maps of these models for the semantic correspondence task. Recent works observe that features from DINOv2 and Stable Diffusion (SD) are complementary: the former produce accurate but sparse correspondences, while the latter produce spatially consistent correspondences. As a result, current state-of-the-art methods for semantic correspondence combine features from both models in an ensemble. While the performance of these methods is impressive, they are computationally expensive because they require evaluating feature maps from large-scale foundation models. In this work we take a different…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning