MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training

Xingyi He; Hao Yu; Sida Peng; Dongli Tan; Zehong Shen; Hujun Bao; Xiaowei Zhou

arXiv:2501.07556·cs.CV·January 14, 2025

TL;DR

This paper introduces a large-scale pre-training framework for cross-modality image matching, enabling models to generalize across diverse imaging modalities and outperform existing methods in multiple unseen tasks.

Contribution

The authors propose a synthetic cross-modal training approach that significantly improves the generalization of image matching models across various modalities.

Findings

01

A model trained with the framework generalizes well to more than eight unseen cross-modality tasks.

02

The approach outperforms both existing general-purpose matching methods and task-specific methods.

03

The method enhances multi-modality analysis in scientific and AI applications.

Abstract

Image matching, which aims to identify corresponding pixel locations between images, is crucial in a wide range of scientific disciplines, aiding in image registration, fusion, and analysis. In recent years, deep learning-based image matching algorithms have dramatically outperformed humans in rapidly and accurately finding large amounts of correspondences. However, when dealing with images captured under different imaging modalities that result in significant appearance changes, the performance of these algorithms often deteriorates due to the scarcity of annotated cross-modal training data. This limitation hinders applications in various fields that rely on multiple image modalities to obtain complementary information. To address this challenge, we propose a large-scale pre-training framework that utilizes synthetic cross-modal training signals, incorporating diverse data from various…
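To make the task definition concrete, the following is a minimal sketch of what a set of correspondences looks like and how its quality can be checked when a ground-truth transform between the two images is known. It is illustrative only and is not the paper's method or evaluation protocol. The names `apply_homography` and `inlier_ratio`, the 3-pixel threshold, and the sample values are all assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]  # (pixel in image A, corresponding pixel in image B)


def apply_homography(H: Sequence[Sequence[float]], pt: Point) -> Point:
    """Map a pixel from image A into image B using a 3x3 homography H."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def inlier_ratio(matches: List[Match],
                 H: Sequence[Sequence[float]],
                 threshold_px: float = 3.0) -> float:
    """Fraction of matches whose reprojection error is within threshold_px.

    The threshold is a hypothetical choice for this sketch.
    """
    if not matches:
        return 0.0
    inliers = 0
    for pt_a, pt_b in matches:
        proj_x, proj_y = apply_homography(H, pt_a)
        error = math.hypot(proj_x - pt_b[0], proj_y - pt_b[1])
        if error <= threshold_px:
            inliers += 1
    return inliers / len(matches)


# Hypothetical ground truth: image B is image A shifted by (+10, -5) pixels.
H_gt = [[1.0, 0.0, 10.0],
        [0.0, 1.0, -5.0],
        [0.0, 0.0, 1.0]]

# Hypothetical matcher output: two good matches and one gross outlier.
matches = [
    ((0.0, 0.0), (10.0, -5.0)),    # exact
    ((20.0, 30.0), (31.0, 25.0)),  # 1 px off
    ((5.0, 5.0), (40.0, 40.0)),    # far off
]

ratio = inlier_ratio(matches, H_gt)
```

In cross-modality settings the two images may look very different (for example, different sensors), but the correspondence representation itself, a list of paired pixel locations, stays the same.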

Code & Models

Models

zju-community/matchanything_eloftr (Hugging Face model; 2.8k downloads, 82 likes)
