RGB-D Salient Object Detection with Ubiquitous Target Awareness
Yifan Zhao, Jiawei Zhao, Jia Li, Xiaowu Chen

TL;DR
This paper introduces a novel RGB-D salient object detection framework that uses depth data during training for supervision but operates solely on RGB data during testing, achieving real-time performance and surpassing state-of-the-art methods.
Contribution
The authors propose a depth-awareness framework built around a Ubiquitous Target Awareness (UTA) network that learns from depth data during training and performs RGB-only salient object detection at inference, so it still works when depth data is unavailable.
Findings
Outperforms state-of-the-art methods on five RGB-D benchmarks
Operates in real time at 43 FPS
Effective cross-modal and cross-level feature interactions
Abstract
Conventional RGB-D salient object detection methods aim to leverage depth as complementary information to find the salient regions in both modalities. However, their results rely heavily on the quality of the captured depth data, which is sometimes unavailable. In this work, we make the first attempt to solve the RGB-D salient object detection problem with a novel depth-awareness framework. This framework relies only on RGB data in the testing phase and uses captured depth data as supervision for representation learning. To construct our framework and achieve accurate salient object detection results, we propose a Ubiquitous Target Awareness (UTA) network to solve three important challenges in the RGB-D SOD task: 1) a depth awareness module to excavate depth information and to mine ambiguous regions via adaptive depth-error weights, 2) a spatial-aware cross-modal…
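The abstract mentions mining ambiguous regions via adaptive depth-error weights but does not give a formula here. As a rough illustration only, the sketch below assumes one plausible reading: pixels where a predicted depth disagrees with the captured depth receive a larger weight in the saliency loss. The function names `depth_error_weights` and `weighted_bce`, the parameter `alpha`, and the normalization are all hypothetical and should not be read as the authors' method.

```python
import math


def depth_error_weights(pred_depth, true_depth, alpha=1.0):
    """Hypothetical per-pixel weights: 1.0 plus a scaled, normalized depth error.

    pred_depth and true_depth are flat lists of per-pixel depth values.
    alpha (an assumed hyperparameter) controls how strongly errors are emphasized.
    """
    errors = [abs(p - t) for p, t in zip(pred_depth, true_depth)]
    max_err = max(errors) if errors else 0.0
    if max_err == 0.0:
        # No depth error anywhere: fall back to uniform weights.
        return [1.0 for _ in errors]
    return [1.0 + alpha * e / max_err for e in errors]


def weighted_bce(pred_sal, true_sal, weights, eps=1e-7):
    """Weighted binary cross-entropy over flat lists of saliency probabilities."""
    total = 0.0
    for p, y, w in zip(pred_sal, true_sal, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / sum(weights)


# Toy example with four pixels (values are made up for illustration).
pred_depth = [0.2, 0.5, 0.9, 0.4]
true_depth = [0.2, 0.4, 0.5, 0.4]
pred_sal = [0.9, 0.6, 0.3, 0.1]
true_sal = [1, 1, 1, 0]

weights = depth_error_weights(pred_depth, true_depth)
loss = weighted_bce(pred_sal, true_sal, weights)
```

In this sketch the depth map is needed only to compute the training loss, which matches the summary's description of depth acting as supervision while inference uses RGB alone.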
