Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines

Xinyi Ying; Chao Xiao; Ruojing Li; Xu He; Boyang Li; Xu Cao; Zhaoxu Li; Yingqian Wang; Mingyuan Hu; Qingyu Xu; Zaiping Lin; Miao Li; Shilin Zhou; Wei An; Weidong Sheng; Li Liu

arXiv:2406.14482 · cs.CV · February 21, 2025

TL;DR

This paper introduces RGBT-Tiny, a large-scale, diverse benchmark dataset for visible-thermal small object detection, along with a new evaluation metric, to advance research in multi-modal small object detection and tracking.

Contribution

The paper presents the first large-scale RGBT small object detection dataset, with high scene diversity and 1.2M manual annotations, and proposes SAFit, a robust scale-adaptive evaluation measure.

Findings

01. Extensive evaluation of 23 algorithms on RGBT-Tiny.

02. SAFit provides more reliable performance assessment for small and large targets.

03. Benchmark facilitates future research in RGBT small object detection and tracking.

Abstract

Small object detection (SOD) has been a longstanding yet challenging task for decades, with numerous datasets and algorithms being developed. However, they mainly focus on either the visible or the thermal modality, while visible-thermal (RGBT) bimodality is rarely explored. Although some RGBT datasets have been developed recently, their insufficient quantity, limited categories, misaligned images and large target sizes mean they cannot provide an impartial benchmark to evaluate multi-category visible-thermal small object detection (RGBT SOD) algorithms. In this paper, we build the first large-scale benchmark with high diversity for RGBT SOD (namely RGBT-Tiny), including 115 paired sequences, 93K frames and 1.2M manual annotations. RGBT-Tiny contains abundant targets (7 categories) and high-diversity scenes (8 types that cover different illumination and density variations). Note that over 81% of targets are…

Code & Models

Repositories

XinyiYing/RGBT-Tiny (Official)

Taxonomy

Topics: Visual Attention and Saliency Detection

Methods: Focus