HODINet: High-Order Discrepant Interaction Network for RGB-D Salient Object Detection

Kang Yi; Jing Xu; Xiao Jin; Fu Guo; Yan-Feng Wu

arXiv:2307.00954·cs.CV·July 4, 2023

TL;DR

HODINet encodes RGB images with a transformer-based backbone and depth maps with a CNN-based backbone, then fuses the two modalities at multiple stages through spatial and channel attention built on high-order feature representations, improving RGB-D salient object detection.

Contribution

The paper proposes a new high-order interaction framework with specialized fusion modules and a cascaded decoding process for RGB-D SOD, addressing feature discrepancy issues.

Findings

1. Achieves competitive performance on seven datasets.
2. Outperforms 24 state-of-the-art methods.
3. Effective high-order feature fusion improves detection accuracy.

Abstract

RGB-D salient object detection (SOD) aims to detect the prominent regions by jointly modeling RGB and depth information. Most RGB-D SOD methods apply the same type of backbones and fusion modules to identically learn the multimodality and multistage features. However, these features contribute differently to the final saliency results, which raises two issues: 1) how to model discrepant characteristics of RGB images and depth maps; 2) how to fuse these cross-modality features in different stages. In this paper, we propose a high-order discrepant interaction network (HODINet) for RGB-D SOD. Concretely, we first employ transformer-based and CNN-based architectures as backbones to encode RGB and depth features, respectively. Then, the high-order representations are delicately extracted and embedded into spatial and channel attentions for cross-modality feature fusion in different stages.…
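The fusion idea in the abstract (high-order statistics feeding an attention that weights one modality before the two are merged) can be illustrated with a small sketch. This is a minimal, dependency-free illustration and not the authors' module: the choice of channel covariance as the "high-order representation", the sigmoid gate, the additive merge, and the names `covariance`, `channel_gate`, and `fuse` are all assumptions made for the example, and the spatial-attention branch is omitted.

```python
import math


def covariance(features):
    """Second-order statistic: C x C covariance between channels.

    `features` is a list of C channels, each a flat list of N values
    (a flattened H x W feature map). Assumed layout, for illustration only.
    """
    n = len(features[0])
    means = [sum(ch) / n for ch in features]
    centered = [[v - m for v in ch] for ch, m in zip(features, means)]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in centered]
        for ci in centered
    ]


def channel_gate(features):
    """Map each covariance row to one attention weight in (0, 1)."""
    cov = covariance(features)
    c = len(cov)
    descriptor = [sum(row) / c for row in cov]  # one scalar per channel
    return [1.0 / (1.0 + math.exp(-d)) for d in descriptor]  # sigmoid


def fuse(rgb, depth):
    """Add depth features to RGB features, gated per channel.

    The gate is derived from the depth features' second-order statistics
    (an assumption; the paper's fusion modules may differ).
    """
    gate = channel_gate(depth)
    return [
        [r + g * d for r, d in zip(r_ch, d_ch)]
        for r_ch, d_ch, g in zip(rgb, depth, gate)
    ]


if __name__ == "__main__":
    rgb = [[1.0, 2.0], [3.0, 4.0]]      # 2 channels, 2 spatial positions
    depth = [[0.0, 2.0], [0.0, 2.0]]
    print(channel_gate(depth))
    print(fuse(rgb, depth))
```

In this sketch a depth map with no variation has zero covariance, so every gate is exactly 0.5; channels that co-vary strongly with the others receive gates closer to 1 and contribute more to the fused features.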

Taxonomy

Topics: Visual Attention and Saliency Detection · Virtual Reality Applications and Impacts · Face Recognition and Perception