RGB-D Salient Object Detection Based on Discriminative Cross-modal Transfer Learning
Hao Chen, Y.F. Li, and Dan Su

TL;DR
This paper casts depth-induced salient object detection as a CNN-based cross-modal transfer problem: it pre-trains on auxiliary RGB saliency data and on a modality classification task to compensate for scarce labeled depth data, and reports significant performance gains.
Contribution
It presents a novel pre-training framework that transfers knowledge from the RGB modality to the depth modality for salient object detection.
Findings
Significant improvement over state-of-the-art methods
Effective use of auxiliary RGB data for depth-induced saliency detection
Pre-training strategy enhances discriminative feature learning
Abstract
In this work, we propose to utilize Convolutional Neural Networks to boost the performance of depth-induced salient object detection by capturing the high-level representative features for depth modality. We formulate the depth-induced saliency detection as a CNN-based cross-modal transfer problem to bridge the gap between the "data-hungry" nature of CNNs and the unavailability of sufficient labeled training data in depth modality. In the proposed approach, we leverage the auxiliary data from the source modality effectively by training the RGB saliency detection network to obtain the task-specific pre-understanding layers for the target modality. Meanwhile, we exploit the depth-specific information by pre-training a modality classification network that encourages modal-specific representations during the optimizing course. Thus, it could make the feature representations of the RGB and…
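The initialization idea described in the abstract can be illustrated with a minimal sketch. It assumes that each network's parameters are a plain dictionary keyed by layer name, and that the depth network takes some layers from the RGB saliency network and others from the modality classification network. The function name `init_depth_network`, the layer names, and the choice of which layers come from which source are all hypothetical; the abstract does not specify them.

```python
from copy import deepcopy


def init_depth_network(depth_params, rgb_saliency_params, modality_cls_params,
                       task_layers, modal_layers):
    """Return a new parameter dict for the depth network.

    Layers in `task_layers` are copied from the RGB saliency network
    (task-specific knowledge); layers in `modal_layers` are copied from
    the modality classification network (modality-specific knowledge).
    All other layers keep their original initialization.
    Which layers belong to which group is an assumption of this sketch.
    """
    overlap = set(task_layers) & set(modal_layers)
    if overlap:
        raise ValueError(f"layers assigned to both sources: {sorted(overlap)}")

    initialized = deepcopy(depth_params)
    for name in task_layers:
        initialized[name] = deepcopy(rgb_saliency_params[name])
    for name in modal_layers:
        initialized[name] = deepcopy(modality_cls_params[name])
    return initialized


# Toy parameters standing in for real weight tensors (hypothetical values).
depth_params = {"conv1": [0.0, 0.0], "conv2": [0.0, 0.0], "saliency_head": [0.0]}
rgb_saliency_params = {"conv1": [0.1, 0.2], "conv2": [0.3, 0.4], "saliency_head": [0.5]}
modality_cls_params = {"conv1": [0.9, 0.8], "conv2": [0.7, 0.6]}

depth_init = init_depth_network(
    depth_params,
    rgb_saliency_params,
    modality_cls_params,
    task_layers=["saliency_head"],  # assumed to come from the RGB saliency net
    modal_layers=["conv1"],         # assumed to come from the modality classifier
)
```

After this step the depth network would be fine-tuned on the labeled depth saliency data; the sketch covers only the weight transfer, not training.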
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Olfactory and Sensory Function Studies
