TL;DR
This paper introduces QSF-Net, a V-D-T (RGB, depth, and thermal) salient object detection model that fuses the three modalities selectively, based on an assessment of depth and thermal image quality, to improve detection accuracy in challenging scenarios.
Contribution
The paper proposes a quality-aware selective fusion network with a new region selection mechanism and attention modules for enhanced V-D-T salient object detection.
Findings
Outperforms existing methods on the VDT-2048 dataset
Effectively identifies high-quality regions for fusion
Refines object edges through attention modules
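The idea of fusing only high-quality regions can be illustrated with a small sketch. This is not the QSF-Net architecture: the function name `selective_fuse`, the per-pixel quality maps, and the fixed `threshold` are all hypothetical, and the learned quality assessment and attention modules are replaced by a simple threshold and an average.

```python
def selective_fuse(rgb, depth, thermal, depth_q, thermal_q, threshold=0.5):
    """Toy region-selective fusion over 2D grids of scalar features.

    depth_q and thermal_q are hypothetical quality maps with values in [0, 1].
    RGB always contributes; depth and thermal contribute only at positions
    where their quality score reaches the threshold.
    """
    fused = []
    for r in range(len(rgb)):
        row = []
        for c in range(len(rgb[r])):
            total, count = rgb[r][c], 1
            if depth_q[r][c] >= threshold:    # depth judged reliable here
                total += depth[r][c]
                count += 1
            if thermal_q[r][c] >= threshold:  # thermal judged reliable here
                total += thermal[r][c]
                count += 1
            row.append(total / count)         # average over selected modalities
        fused.append(row)
    return fused


# Example: a 2x2 grid where depth is unreliable in the right column
# and thermal is unreliable in the bottom row.
rgb = [[1.0, 1.0], [1.0, 1.0]]
depth = [[3.0, 3.0], [3.0, 3.0]]
thermal = [[5.0, 5.0], [5.0, 5.0]]
depth_q = [[0.9, 0.1], [0.9, 0.1]]
thermal_q = [[0.9, 0.9], [0.1, 0.1]]
fused = selective_fuse(rgb, depth, thermal, depth_q, thermal_q)
```

In the bottom-right cell both auxiliary modalities fall below the threshold, so the fused value there is the RGB value alone. This shows the intended effect of selective fusion: low-quality inputs are prevented from degrading the result.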
Abstract
Depth images and thermal images contain spatial geometry information and surface temperature information, respectively, which can complement the RGB modality. However, the quality of depth and thermal images is often unreliable in challenging scenarios, which degrades the performance of two-modal salient object detection (SOD). Meanwhile, some researchers have turned to the triple-modal SOD task, which attempts to exploit the complementarity of the RGB image, the depth image, and the thermal image. However, existing triple-modal SOD methods fail to perceive the quality of depth maps and thermal images, which leads to performance degradation in scenes with low-quality depth and thermal images. Therefore, we propose a quality-aware selective fusion network (QSF-Net) to conduct V-D-T salient object detection, which…
