XoFTR: Cross-modal Feature Matching Transformer

Önder Tuzcuoğlu; Aybora Köksal; Buğra Sofu; Sinan Kalkan; A. Aydın Alatan

arXiv:2404.09692·cs.CV·April 16, 2024·1 cites

TL;DR

XoFTR is a novel transformer-based approach for local feature matching between thermal infrared and visible images, effectively handling modality differences and viewpoint variations through pre-training, augmentation, and refined matching techniques.

Contribution

The paper introduces XoFTR, a cross-modal feature matching transformer that incorporates masked image modeling, pseudo-thermal augmentation, and a refined matching pipeline for improved thermal-visible image matching.
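Masked image modeling, named above, generally means hiding random patches of the input and training the network to reconstruct them. The following is a minimal sketch of only the masking step, written in dependency-free Python. The helper names `random_patch_mask` and `apply_mask`, the patch size, and the mask ratio are all hypothetical illustrations and are not taken from the paper or its repository.

```python
import random


def random_patch_mask(height, width, patch=8, ratio=0.5, seed=None):
    """Return a height x width grid of 0/1 flags; 1 marks a masked pixel.

    Hypothetical helper: patch size and mask ratio are illustrative
    values, not settings from the paper.
    """
    if height % patch or width % patch:
        raise ValueError("height and width must be multiples of patch")
    rng = random.Random(seed)
    rows, cols = height // patch, width // patch
    n_masked = round(ratio * rows * cols)
    # Pick which patches (indexed row-major) are hidden.
    masked = set(rng.sample(range(rows * cols), n_masked))
    return [
        [1 if (y // patch) * cols + (x // patch) in masked else 0
         for x in range(width)]
        for y in range(height)
    ]


def apply_mask(image, mask, fill=0.0):
    """Replace masked pixels with a constant value.

    In a masked-image-modeling setup, a network would then be trained
    to reconstruct the original values at the masked positions.
    """
    return [
        [fill if m else v for v, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Usage: mask half of the 8x8 patches of a 16x16 image of ones.
image = [[1.0] * 16 for _ in range(16)]
mask = random_patch_mask(16, 16, patch=8, ratio=0.5, seed=0)
masked_image = apply_mask(image, mask)
```

A real training pipeline would run this on batched tensors in a framework such as PyTorch; nested lists are used here only to keep the sketch self-contained.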

Findings

01. Outperforms existing methods on multiple benchmarks
02. Effective handling of modality, viewpoint, and scale differences
03. Provides a new comprehensive thermal-visible dataset

Abstract

We introduce XoFTR, a cross-modal cross-view method for local feature matching between thermal infrared (TIR) and visible images. Unlike visible images, TIR images are less susceptible to adverse lighting and weather conditions but present difficulties in matching due to significant texture and intensity differences. Current hand-crafted and learning-based methods for visible-TIR matching fall short in handling viewpoint, scale, and texture diversities. To address this, XoFTR incorporates masked image modeling pre-training and fine-tuning with pseudo-thermal image augmentation to handle the modality differences. Additionally, we introduce a refined matching pipeline that adjusts for scale discrepancies and enhances match reliability through sub-pixel level refinement. To validate our approach, we collect a comprehensive visible-thermal dataset, and show that our method outperforms…
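The abstract does not say how the pseudo-thermal images are produced. As a rough illustration of the general idea, the sketch below remaps grayscale intensities through a randomized cosine curve. Because the curve is not monotonic, bright regions can become dark and vice versa, loosely imitating the way thermal intensity does not follow visible brightness. The function name `pseudo_thermal`, the choice of a cosine curve, and the parameter ranges are assumptions made for this example, not the authors' recipe.

```python
import math
import random


def pseudo_thermal(gray, seed=None):
    """Remap a grayscale image (values in [0, 1]) with a random cosine curve.

    Illustrative assumption only: the frequency and phase ranges are
    hypothetical and not taken from the paper.
    """
    rng = random.Random(seed)
    freq = rng.uniform(0.5, 2.0)            # hypothetical range
    phase = rng.uniform(0.0, 2.0 * math.pi)  # hypothetical range
    # The same curve is applied to every pixel, so image structure is
    # preserved while the intensity ordering may change.
    return [
        [0.5 * (1.0 + math.cos(2.0 * math.pi * freq * v + phase)) for v in row]
        for row in gray
    ]


# Usage: augment a small horizontal intensity ramp.
ramp = [[x / 7.0 for x in range(8)] for _ in range(4)]
augmented = pseudo_thermal(ramp, seed=42)
```

In a fine-tuning setup of the kind the abstract describes, an augmentation like this would be applied to one image of a visible-visible training pair so that the matcher sees a visible image next to a thermal-looking one.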

Code & Models

Repositories

ondert/xoftr (PyTorch, official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Multimodal Machine Learning Applications