BTS-Net: Bi-directional Transfer-and-Selection Network For RGB-D Salient Object Detection
Wenbo Zhang, Yao Jiang, Keren Fu, Qijun Zhao

TL;DR
BTS-Net introduces a bi-directional transfer-and-selection mechanism for RGB-D salient object detection that keeps encoder features robust when depth maps are of low quality, and it outperforms 16 state-of-the-art methods on six datasets.
Contribution
The paper proposes a novel bi-directional transfer-and-selection network with early cross-modal interactions for improved RGB-D SOD.
Findings
Outperforms 16 state-of-the-art methods on six datasets
Uses bi-directional interactions to enhance feature robustness
Employs a lightweight decoder for accurate saliency prediction
Abstract
Depth information has been proven beneficial in RGB-D salient object detection (SOD). However, the depth maps obtained often suffer from low quality and inaccuracy. Most existing RGB-D SOD models have no cross-modal interactions, or only unidirectional interactions from depth to RGB, in their encoder stages, which may lead to inaccurate encoder features when facing low-quality depth. To address this limitation, we propose to conduct progressive bi-directional interactions as early as the encoder stage, yielding a novel bi-directional transfer-and-selection network named BTS-Net, which adopts a set of bi-directional transfer-and-selection (BTS) modules to purify features during encoding. Based on the resulting robust encoder features, we also design an effective light-weight group decoder to achieve accurate final saliency prediction. Comprehensive experiments on six widely used datasets…
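To make the contrast between unidirectional and bi-directional interaction concrete, here is a minimal toy sketch. It is not the BTS module from the paper: the sigmoid gating rule, the function names, and the use of flat lists in place of feature maps are all assumptions made for illustration. The only idea taken from the abstract is that features flow in both directions (RGB to depth and depth to RGB) rather than from depth to RGB alone.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def bidirectional_transfer(rgb, depth):
    """Toy two-way feature exchange (hypothetical gating, not the paper's design).

    rgb, depth: equal-length lists of floats standing in for flattened
    encoder features of each modality.
    """
    # Each modality derives a per-element gate from its own activations.
    # In this toy, the gate plays the role of "selection".
    rgb_gate = [sigmoid(v) for v in rgb]
    depth_gate = [sigmoid(v) for v in depth]
    # Depth -> RGB: RGB receives depth features weighted by the depth gate.
    rgb_out = [r + g * d for r, d, g in zip(rgb, depth, depth_gate)]
    # RGB -> depth: depth receives RGB features weighted by the RGB gate.
    depth_out = [d + g * r for r, d, g in zip(rgb, depth, rgb_gate)]
    return rgb_out, depth_out


# An uninformative (all-zero) depth input leaves the RGB branch unchanged,
# while the depth branch is still refined by RGB features.
rgb_out, depth_out = bidirectional_transfer([1.0, -2.0], [0.0, 0.0])
```

A unidirectional variant would compute only `rgb_out` and pass `depth` through untouched, so a poor depth map would never be corrected by RGB cues during encoding.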
Taxonomy
Topics: Visual Attention and Saliency Detection · Image and Video Quality Assessment · Olfactory and Sensory Function Studies
