LoFTR: Detector-Free Local Feature Matching with Transformers

Jiaming Sun; Zehong Shen; Yuang Wang; Hujun Bao; Xiaowei Zhou

arXiv:2104.00680 · cs.CV · April 2, 2021 · 75 cites

TL;DR

LoFTR introduces a transformer-based, detector-free approach for dense local feature matching that excels in low-texture areas and outperforms existing methods on multiple benchmarks.

Contribution

LoFTR proposes a dense matching method that replaces the traditional feature detection step with Transformer self and cross attention layers, improving matching in challenging low-texture regions.

Findings

01  Outperforms state-of-the-art methods on indoor and outdoor datasets.

02  Ranks first on two public visual localization benchmarks.

03  Effectively produces dense matches in low-texture areas.

Abstract

We present a novel method for local image feature matching. Instead of performing image feature detection, description, and matching sequentially, we propose to first establish pixel-wise dense matches at a coarse level and later refine the good matches at a fine level. In contrast to dense methods that use a cost volume to search correspondences, we use self and cross attention layers in Transformer to obtain feature descriptors that are conditioned on both images. The global receptive field provided by Transformer enables our method to produce dense matches in low-texture areas, where feature detectors usually struggle to produce repeatable interest points. The experiments on indoor and outdoor datasets show that LoFTR outperforms state-of-the-art methods by a large margin. LoFTR also ranks first on two public benchmarks of visual localization among the published methods.
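
The coarse-level step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`attention`, `self_cross_layer`, `coarse_matches`), the single-head attention without learned projections or position encodings, the confidence score (a product of row-wise and column-wise softmaxes) and the `threshold` value are all assumptions chosen for illustration, and the fine-level refinement is omitted. The sketch only shows how self and cross attention make each image's descriptors depend on both images, and how mutual nearest neighbors can then be selected from a dense score matrix.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(query, key, value):
    """Single-head scaled dot-product attention (no learned projections)."""
    scores = query @ key.T / np.sqrt(query.shape[-1])
    return softmax(scores, axis=-1) @ value


def self_cross_layer(feat_a, feat_b):
    """One illustrative self + cross attention step with residual updates.

    feat_a: (N, D) coarse features of image A, flattened over pixels.
    feat_b: (M, D) coarse features of image B, flattened over pixels.
    """
    # Self attention: every location attends to all locations in its own image.
    feat_a = feat_a + attention(feat_a, feat_a, feat_a)
    feat_b = feat_b + attention(feat_b, feat_b, feat_b)
    # Cross attention: every location attends to all locations in the other image,
    # so the resulting descriptors are conditioned on both images.
    new_a = feat_a + attention(feat_a, feat_b, feat_b)
    new_b = feat_b + attention(feat_b, feat_a, feat_a)
    return new_a, new_b


def coarse_matches(feat_a, feat_b, threshold=0.2):
    """Select mutual nearest neighbors from a dense confidence matrix.

    The confidence (assumed here) is the product of a softmax over rows
    and a softmax over columns of the similarity matrix.
    """
    sim = feat_a @ feat_b.T                              # (N, M) similarities
    conf = softmax(sim, axis=1) * softmax(sim, axis=0)   # (N, M) confidences
    best_b = conf.argmax(axis=1)   # best location in B for each location in A
    best_a = conf.argmax(axis=0)   # best location in A for each location in B
    idx_a = np.arange(feat_a.shape[0])
    mutual = best_a[best_b] == idx_a
    keep = mutual & (conf[idx_a, best_b] > threshold)
    return np.stack([idx_a[keep], best_b[keep]], axis=1), conf


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat_a = rng.normal(size=(6, 8))   # hypothetical 6 coarse locations, 8-dim
    feat_b = rng.normal(size=(5, 8))   # hypothetical 5 coarse locations, 8-dim
    feat_a, feat_b = self_cross_layer(feat_a, feat_b)
    matches, conf = coarse_matches(feat_a, feat_b)
    print(matches)
```

Because every location attends to every other location in both images, a descriptor in a low-texture region can still draw on distant, more distinctive context, which is the property the abstract attributes to the Transformer's global receptive field.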

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Dense Connections · Attention Is All You Need · Dropout · Residual Connection · Byte Pair Encoding · Layer Normalization