Alignment-Free RGB-T Salient Object Detection: A Large-scale Dataset and Progressive Correlation Network
Kunpeng Wang, Keke Chen, Chenglong Li, Zhengzheng Tu, Bin Luo

TL;DR
This paper introduces UVT20K, a large-scale unaligned RGB-T dataset with diverse, challenging scenes and comprehensive annotations, and proposes PCNet (Progressive Correlation Network), a network that models correlations to improve salient object detection without alignment.
Contribution
The paper presents a new large-scale unaligned RGB-T dataset and a novel Progressive Correlation Network for improved alignment-free salient object detection.
Findings
UVT20K contains 20,000 image pairs spanning 407 scenes and 1,256 object categories.
PCNet achieves state-of-the-art performance on unaligned RGB-T SOD tasks.
The dataset and method facilitate research in alignment-free multimodal detection.
Abstract
Alignment-free RGB-Thermal (RGB-T) salient object detection (SOD) aims to achieve robust performance in complex scenes by directly leveraging the complementary information from unaligned visible-thermal image pairs, without requiring manual alignment. However, the labor-intensive process of collecting and annotating image pairs limits the scale of existing benchmarks, hindering the advancement of alignment-free RGB-T SOD. In this paper, we construct a large-scale and high-diversity unaligned RGB-T SOD dataset named UVT20K, comprising 20,000 image pairs, 407 scenes, and 1256 object categories. All samples are collected from real-world scenarios with various challenges, such as low illumination, image clutter, complex salient objects, and so on. To support the exploration for further research, each sample in UVT20K is annotated with a comprehensive set of ground truths, including saliency…
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image Fusion Techniques · Aesthetic Perception and Analysis
Methods: Sparse Evolutionary Training
