ATR-UMMIM: A Benchmark Dataset for UAV-Based Multimodal Image Registration under Complex Imaging Conditions

Kangcheng Bin; Chen Chen; Ting Hu; Jiahao Qi; and Ping Zhong

arXiv:2507.20764 · cs.CV · July 29, 2025

TL;DR

ATR-UMMIM is the first comprehensive benchmark dataset for multimodal image registration in UAV-based aerial scenarios, facilitating the development of robust registration methods under diverse real-world conditions.

Contribution

This paper introduces ATR-UMMIM, a large-scale, multi-scenario dataset with high-quality annotations for multimodal registration in UAV applications, filling a critical gap in existing resources.

Findings

01

Provides 7,969 triplets of raw visible, infrared, and precisely registered visible images.

02

Includes diverse scenarios with varying altitudes, angles, and weather conditions.

03

Enables benchmarking of registration robustness and downstream perception tasks.

Abstract

Multimodal fusion has become a key enabler for UAV-based object detection, as each modality provides complementary cues for robust feature extraction. However, due to significant differences in resolution, field of view, and sensing characteristics across modalities, accurate registration is a prerequisite for fusion. Despite its importance, there is currently no publicly available benchmark specifically designed for multimodal registration in UAV-based aerial scenarios, which severely limits the development and evaluation of advanced registration methods under real-world conditions. To bridge this gap, we present ATR-UMMIM, the first benchmark dataset specifically tailored for multimodal image registration in UAV-based applications. The dataset includes 7,969 triplets of raw visible, infrared, and precisely registered visible images, captured across diverse scenarios including…
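To make the registration requirement concrete, here is a minimal sketch of one common way to score an alignment between a visible and an infrared image: map landmark points through a 3x3 homography and average the pixel distance to their infrared counterparts. This is an illustration only. The excerpt above does not state which transformation model or metrics ATR-UMMIM uses, so the homography assumption, the function names, and the sample points are all hypothetical.

```python
from math import hypot


def apply_homography(H, point):
    """Map an (x, y) pixel through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    return (u / w, v / w)


def mean_registration_error(H, visible_pts, infrared_pts):
    """Mean Euclidean distance, in pixels, between warped visible landmarks
    and their corresponding infrared landmarks."""
    if not visible_pts or len(visible_pts) != len(infrared_pts):
        raise ValueError("need two non-empty point lists of equal length")
    total = 0.0
    for p, q in zip(visible_pts, infrared_pts):
        u, v = apply_homography(H, p)
        total += hypot(u - q[0], v - q[1])
    return total / len(visible_pts)


# Hypothetical transform: the visible frame is scaled by 0.5 and shifted by
# (10, 20) pixels to land on the infrared frame. These values are made up.
H = [
    [0.5, 0.0, 10.0],
    [0.0, 0.5, 20.0],
    [0.0, 0.0, 1.0],
]
visible_pts = [(0.0, 0.0), (100.0, 200.0), (640.0, 480.0)]
infrared_pts = [apply_homography(H, p) for p in visible_pts]

# A perfect transform gives zero error; an identity guess does not.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(mean_registration_error(H, visible_pts, infrared_pts))
print(mean_registration_error(identity, visible_pts, infrared_pts))
```

A single homography is exact only for a planar scene or a pure camera rotation, so in real aerial imagery with parallax it should be read as a simplification. A lower mean error indicates a better alignment between the two modalities.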
