Modality-Guided Subnetwork for Salient Object Detection
Zongwei Wu, Guillaume Allibert, Christophe Stolz, Chao Ma, Cédric Demonceaux

TL;DR
This paper introduces MGSnet, a flexible and lightweight model for salient object detection that can operate on RGB and RGBD data, estimating depth when unavailable, achieving real-time performance with state-of-the-art results.
Contribution
The paper presents a novel modality-guided subnetwork that estimates depth from RGB, enabling real-time RGB saliency detection and easy integration into RGBD models with minimal additional cost.
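The fallback behaviour described above (use measured depth when it is supplied, otherwise derive a pseudo-depth map from RGB) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `resolve_modalities` and `placeholder_depth_estimator` are hypothetical, and the luminance-based estimator is only a stand-in for a learned depth-prediction subnetwork so that the example runs on its own.

```python
from typing import Callable, Optional, Tuple

import numpy as np

# Hypothetical type alias: any function mapping an HxWx3 image to an HxW map.
DepthEstimator = Callable[[np.ndarray], np.ndarray]


def placeholder_depth_estimator(rgb: np.ndarray) -> np.ndarray:
    """Stand-in for a learned depth-prediction subnetwork (assumption).

    Returns min-max normalized luminance. This is NOT the paper's estimator;
    it only keeps the sketch self-contained.
    """
    gray = rgb.astype(np.float32).mean(axis=-1)
    span = float(gray.max() - gray.min())
    if span == 0.0:
        return np.zeros_like(gray)
    return (gray - gray.min()) / span


def resolve_modalities(
    rgb: np.ndarray,
    depth: Optional[np.ndarray] = None,
    estimator: DepthEstimator = placeholder_depth_estimator,
) -> Tuple[np.ndarray, np.ndarray, bool]:
    """Return (rgb, depth_map, was_estimated) for RGB or RGBD input."""
    if rgb.ndim != 3 or rgb.shape[-1] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    if depth is None:
        # RGB-only input: mimic the second modality with a pseudo-depth map.
        return rgb, estimator(rgb), True
    if depth.shape != rgb.shape[:2]:
        raise ValueError("depth must have shape (H, W) matching rgb")
    # RGBD input: pass the measured depth through unchanged.
    return rgb, depth.astype(np.float32), False
```

A downstream saliency model would then consume the returned pair in the same way regardless of whether the depth map was measured or estimated; the boolean flag is included only so callers can tell the two cases apart.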
Findings
Achieves state-of-the-art performance on RGB SOD tasks.
Operates in real-time with minimal computational overhead.
Easily integrates into existing RGBD models for enhanced performance.
Abstract
Recent RGBD-based models for saliency detection have attracted research attention. Depth cues such as boundary cues, surface normals, and shape attributes help identify salient objects in complicated scenes. However, most RGBD networks require both modalities at the input and feed them separately through a two-stream design, which inevitably incurs extra cost for depth sensors and computation. To address these drawbacks, we present a novel fusion design named modality-guided subnetwork (MGSnet). It has the following advantages: 1) Our model works for both RGB and RGBD data, dynamically estimating depth when it is not available. Taking the inner workings of depth-prediction networks into account, we propose to estimate pseudo-geometry maps from the RGB input, essentially mimicking the multi-modality input. 2) Our MGSnet…
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Image and Video Quality Assessment
