TP3M: Transformer-based Pseudo 3D Image Matching with Reference Image

Liming Han; Zhaoxiang Liu; Shiguo Lian

arXiv:2405.08434·cs.CV·August 13, 2024

TL;DR

This paper introduces a Transformer-based pseudo 3D image matching method that leverages a reference image to enhance feature descriptors, significantly improving matching accuracy in challenging scenes with large viewpoint or illumination changes.

Contribution

It proposes a novel pseudo 3D matching approach using a reference image to upgrade 2D features to 3D, enhancing matching performance in difficult scenarios.

Findings

01 Achieves state-of-the-art results in homography estimation

02 Improves pose estimation accuracy in challenging scenes

03 Enhances visual localization performance

Abstract

Image matching remains challenging in scenes with large viewpoint or illumination changes, or with low texture. In this paper, we propose a Transformer-based pseudo 3D image matching method. It upgrades the 2D features extracted from the source image to 3D features with the help of a reference image, then matches them to the 2D features extracted from the destination image through coarse-to-fine 3D matching. Our key discovery is that introducing the reference image screens the source image's fine points and further enriches their feature descriptors from 2D to 3D, which improves matching performance against the destination image. Experimental results on multiple datasets show that the proposed method achieves state-of-the-art results on homography estimation, pose estimation, and visual localization, especially in challenging scenes.
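
Homography estimation, one of the tasks named in the abstract, takes point matches between two images as input. The sketch below is not the paper's method: it is the textbook direct linear transform (DLT) in NumPy, shown only to illustrate how a set of matches becomes a homography. The function names `estimate_homography` and `apply_homography` are hypothetical, and the sketch assumes outlier-free matches with no coordinate normalization or RANSAC, both of which a practical pipeline would likely add.

```python
import numpy as np


def estimate_homography(src_pts, dst_pts):
    """Estimate a 3x3 homography from N >= 4 point matches using the DLT.

    Illustrative only: assumes clean matches (no outlier rejection) and
    skips the coordinate normalization usually applied for stability.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2:
        raise ValueError("src_pts and dst_pts must both have shape (N, 2)")
    if len(src) < 4:
        raise ValueError("at least 4 matches are required")

    # Each match (x, y) -> (u, v) contributes two linear constraints on
    # the 9 entries of H (stacked row-major).
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    A = np.array(rows)

    # The solution is the right singular vector associated with the
    # smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the scale so that H[2, 2] == 1


def apply_homography(H, pts):
    """Map (N, 2) points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    proj = homog @ H.T
    return proj[:, :2] / proj[:, 2:3]


if __name__ == "__main__":
    # Synthetic check: recover a known homography from exact matches.
    H_true = np.array([[1.2, 0.1, 5.0],
                       [-0.05, 0.9, -3.0],
                       [1e-4, 2e-4, 1.0]])
    src = np.array([[0, 0], [100, 0], [100, 100], [0, 100], [50, 20]], dtype=float)
    dst = apply_homography(H_true, src)
    H_est = estimate_homography(src, dst)
    print(np.allclose(H_est, H_true, atol=1e-5))
```

The quality of the estimated homography depends directly on the quality of the input matches, which is why matching accuracy in difficult scenes matters for this task.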
