VisIRNet: Deep Image Alignment for UAV-taken Visible and Infrared Image Pairs

Sedat Ozer; Alain P. Ndigande

arXiv:2402.09635 · cs.CV · February 16, 2024 · 1 citation



TL;DR

VisIRNet introduces a deep learning method for aligning UAV-captured visible and infrared images without relying on traditional Lucas-Kanade techniques, achieving state-of-the-art accuracy through a two-branch CNN approach.

Contribution

The paper presents a novel CNN-based multi-modal image alignment method that predicts corner coordinates or homography directly, outperforming LK-based methods on aerial datasets.

Findings

01. Achieves state-of-the-art alignment accuracy on four aerial datasets.

02. Outperforms existing deep LK-based architectures.

03. Uses a two-branch CNN with feature embedding for robust alignment.

Abstract

This paper proposes a deep-learning-based solution for multi-modal alignment of UAV-taken images. Many recently proposed state-of-the-art alignment techniques rely on Lucas-Kanade (LK) based solutions for successful alignment. However, we show that state-of-the-art results can be achieved without LK-based methods. Our approach uses a two-branch convolutional neural network (CNN) built from feature embedding blocks. We propose two variants of our approach: in the first variant (ModelA), we directly predict the new coordinates of only the four corners of the image to be aligned; in the second (ModelB), we predict the homography matrix directly. Applying alignment on the image corners forces the algorithm to match only those four corners as opposed to computing and matching many (key)points, since the latter may cause many outliers,…
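The two outputs named in the abstract are closely related: four corner correspondences in general position determine a homography up to scale. The sketch below is not the authors' code; it is a minimal pure-Python illustration of the standard four-point direct linear solve, assuming the normalization h33 = 1. The function names (`solve_linear`, `homography_from_corners`, `apply_homography`) and the corner values in the usage note are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so row operations also update the right-hand side.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def homography_from_corners(src, dst):
    """Return the 3x3 homography (with h33 = 1) mapping four src corners to dst."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(H, point):
    """Map a 2D point through H using homogeneous coordinates."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

As a usage illustration, take the corners of a hypothetical 128x128 patch, `src = [(0, 0), (127, 0), (127, 127), (0, 127)]`, and four made-up displaced corners `dst`. Calling `homography_from_corners(src, dst)` then yields a matrix that `apply_homography` uses to map each source corner onto its displaced counterpart, up to floating-point error.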


Code & Models

Repositories

ozerlabs-proxy/VisIrNet · tf · Official


Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques