FMRT: Learning Accurate Feature Matching with Reconciliatory Transformer

Xinyu Zhang; Li Wang; Zhiqiang Jiang; Kun Dai; Tao Xie; Lei Yang; Wenhao Yu; Yang Shen; Jun Li

arXiv:2310.13605 · cs.CV · October 23, 2023 · 1 citation

Open Access

TL;DR

FMRT introduces a novel Transformer-based approach that adaptively reconciles multi-scale features and enhances positional encoding, significantly improving local feature matching accuracy in various computer vision tasks.

Contribution

The paper presents FMRT, a new detector-free Transformer method with a Reconciliatory Transformer that adaptively integrates multi-scale features and reliable positional encoding.

Findings

01. FMRT outperforms existing methods on multiple benchmarks.

02. It achieves higher accuracy in pose estimation and visual localization.

03. The approach demonstrates robustness across diverse computer vision tasks.

Abstract

Local feature matching, an essential component of several computer vision tasks (e.g., structure from motion and visual localization), has been effectively addressed by Transformer-based methods. However, these methods integrate long-range context information among keypoints only with a fixed receptive field, which prevents the network from reconciling the importance of features with different receptive fields to achieve complete image perception, and hence limits matching accuracy. In addition, these methods use a conventional handcrafted encoding to integrate the positional information of keypoints into the visual descriptors, which limits the network's ability to extract reliable positional encoding information. In this study, we propose Feature Matching with Reconciliatory Transformer (FMRT), a novel Transformer-based detector-free method that reconciles different…
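For readers unfamiliar with the "conventional handcrafted encoding" that the abstract contrasts FMRT against, the following is a minimal sketch of a fixed 2D sinusoidal positional encoding added to a visual descriptor. It is not FMRT's method and is not taken from the paper; the function names, the channel layout (first half for x, second half for y), and the `temperature` value are illustrative assumptions.

```python
import math


def sinusoidal_encoding_2d(x, y, dim, temperature=10000.0):
    """Return a fixed (non-learned) sinusoidal encoding for position (x, y).

    Assumed layout: the first dim/2 channels encode x, the last dim/2
    channels encode y, each as interleaved sin/cos pairs.
    """
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    half = dim // 2
    enc = []
    for coord in (x, y):
        for i in range(0, half, 2):
            # Frequencies decay geometrically across channel pairs.
            freq = 1.0 / (temperature ** (i / half))
            enc.append(math.sin(coord * freq))
            enc.append(math.cos(coord * freq))
    return enc


def add_position(descriptor, x, y):
    """Add the handcrafted encoding to a descriptor (a list of floats)."""
    pe = sinusoidal_encoding_2d(x, y, len(descriptor))
    return [d + p for d, p in zip(descriptor, pe)]
```

Because the encoding is a fixed function of the coordinates, the network cannot adapt it during training, which is the limitation the abstract points to.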

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Advanced Neural Network Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Position-Wise Feed-Forward Layer · Dense Connections · Residual Connection · Absolute Position Encodings · Adam · Byte Pair Encoding