OmniGlue: Generalizable Feature Matching with Foundation Model Guidance

Hanwen Jiang; Arjun Karpur; Bingyi Cao; Qixing Huang; Andre Araujo

arXiv:2405.12979 · cs.CV · May 22, 2024 · 2 cites

TL;DR

OmniGlue is a novel learnable image matching method that significantly improves generalization across diverse and unseen image domains by leveraging foundation model guidance and a new attention mechanism.

Contribution

The paper introduces OmniGlue, the first learnable matcher designed specifically for strong domain generalization. It combines guidance from a vision foundation model with a keypoint position-guided attention mechanism.

Findings

01 20.9% relative improvement on unseen domains over a directly comparable reference model

02 Outperforms LightGlue by 9.5% (relative)

03 Evaluated on 7 datasets with varied image domains

Abstract

The image matching field has been witnessing a continuous emergence of novel learnable feature matching techniques, with ever-improving performance on conventional benchmarks. However, our investigation shows that despite these gains, their potential for real-world applications is restricted by their limited generalization capabilities to novel image domains. In this paper, we introduce OmniGlue, the first learnable image matcher that is designed with generalization as a core principle. OmniGlue leverages broad knowledge from a vision foundation model to guide the feature matching process, boosting generalization to domains not seen at training time. Additionally, we propose a novel keypoint position-guided attention mechanism which disentangles spatial and appearance information, leading to enhanced matching descriptors. We perform comprehensive experiments on a suite of 7 datasets…
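The idea of disentangling spatial and appearance information in attention can be illustrated with a minimal sketch. It assumes a single-head self-attention layer in which positional encodings are added to the queries and keys only, so position steers where each keypoint attends while the values (and therefore the updated descriptors) carry appearance alone. The function name `position_guided_attention`, the array shapes, and the random weights are hypothetical and are not taken from the OmniGlue code; the actual layer may differ in detail.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def position_guided_attention(desc, pos_enc, w_q, w_k, w_v):
    """Single-head self-attention sketch (assumed form, not the official code).

    desc:    (N, D) appearance descriptors, one row per keypoint
    pos_enc: (N, D) positional encodings of the keypoint coordinates
    w_q, w_k, w_v: (D, D) projection matrices
    """
    guided = desc + pos_enc           # position enters queries and keys only
    q = guided @ w_q
    k = guided @ w_k
    v = desc @ w_v                    # values are built from appearance alone
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    updated = desc + weights @ v      # residual update of the descriptors
    return updated, weights


rng = np.random.default_rng(0)
n, d = 5, 8
desc = rng.normal(size=(n, d))
pos_enc = rng.normal(size=(n, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))

updated, weights = position_guided_attention(desc, pos_enc, w_q, w_k, w_v)
```

In this sketch the positional encoding influences the result only through the attention weights: if the weights are held fixed, changing `pos_enc` leaves the updated descriptors unchanged.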

Code & Models

Repositories

google-research/omniglue
PyTorch · Official

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition · Video Analysis and Summarization