A Unified Structure for Efficient RGB and RGB-D Salient Object Detection

Peng Peng; Yong-Jie Li

arXiv:2012.00437·cs.CV·October 30, 2024·1 cites

TL;DR

This paper introduces a unified, efficient neural network structure with a cross-attention module that effectively handles both RGB and RGB-D salient object detection tasks, outperforming existing methods.

Contribution

The paper presents a novel unified architecture with a cross-attention context extraction module that efficiently fuses RGB and depth information for salient object detection.

Findings

01 Outperforms state-of-the-art methods on multiple datasets

02 Effectively fuses RGB and depth data with a unified network

03 Achieves superior metrics in both RGB and RGB-D SOD tasks

Abstract

Salient object detection (SOD) has been well studied in recent years, especially with deep neural networks. However, SOD on RGB and RGB-D images is usually treated as two different tasks, each requiring a specifically designed network structure. In this paper, we propose a unified and efficient structure with a cross-attention context extraction (CRACE) module that addresses both SOD tasks. The proposed CRACE module receives and appropriately fuses two (for RGB SOD) or three (for RGB-D SOD) inputs. The simple, unified feature pyramid network (FPN)-like structure with CRACE modules conveys and refines the results under multi-level supervision of saliency and boundaries. The proposed structure is simple yet effective: it efficiently extracts and fuses the rich context information of RGB and depth. Experimental…
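
To make the idea of one module accepting either two or three inputs concrete, here is a minimal sketch of generic scaled dot-product cross-attention in plain Python. It is not the paper's CRACE module, whose internals the abstract does not describe. The function name `cross_attention_fuse`, the choice of the first input as the query, the concatenation of the remaining inputs as keys and values, and the residual addition are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention_fuse(query_tokens, *context_inputs):
    """Hypothetical fusion: the first input queries the other one or two.

    Each input is a list of tokens and each token is a list of floats of
    the same dimension. With one context input the call has two inputs in
    total (the RGB case); with two context inputs it has three (the RGB-D
    case, e.g. with depth features added). This mapping is an assumption.
    """
    if not 1 <= len(context_inputs) <= 2:
        raise ValueError("expected two or three inputs in total")

    # Assumption: context inputs are simply concatenated along the token axis.
    context = [token for inp in context_inputs for token in inp]
    dim = len(query_tokens[0])
    scale = 1.0 / math.sqrt(dim)

    fused = []
    for q in query_tokens:
        # Attention weights of this query token over all context tokens.
        weights = softmax([dot(q, k) * scale for k in context])
        # Weighted sum of context tokens (keys double as values here).
        attended = [
            sum(w * k[i] for w, k in zip(weights, context)) for i in range(dim)
        ]
        # Assumption: a residual connection keeps the original query content.
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused


# Toy feature tokens; real feature maps would be flattened spatial grids.
rgb_low = [[1.0, 0.0]]
rgb_high = [[0.0, 2.0], [0.0, 2.0]]
depth = [[0.0, 2.0]]

fused_rgb = cross_attention_fuse(rgb_low, rgb_high)          # two inputs
fused_rgbd = cross_attention_fuse(rgb_low, rgb_high, depth)  # three inputs
```

The point of the sketch is only that the same function handles both cases without structural changes, because the extra input just extends the set of context tokens. A real implementation would use learned projections for queries, keys, and values and would operate on convolutional feature maps.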

Taxonomy

Topics: Visual Attention and Saliency Detection · Gaze Tracking and Assistive Technology · Face Recognition and Perception