Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective
Songsong Duan, Xi Yang, Nannan Wang, Xinbo Gao

TL;DR
SATNet is a lightweight RGB-D salient object detection network that balances speed and accuracy by enhancing depth quality, modality fusion, and feature representation, outperforming heavyweight models while maintaining high efficiency.
Contribution
The paper introduces SATNet, a novel lightweight RGB-D SOD framework with modules for high-quality depth generation, decoupled modality fusion, and enriched feature representation, achieving state-of-the-art results.
Findings
Outperforms state-of-the-art CNN-based models.
Achieves 415 FPS with only 5.2 million parameters.
Demonstrates superior accuracy on five public datasets.
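To put the reported speed and size figures in more familiar units, the short sketch below converts frames per second into per-image latency and a parameter count into an approximate weight footprint. It assumes 32-bit floating-point weights (4 bytes per parameter), which this page does not state. The helper names `latency_ms` and `param_memory_mb` are illustrative only. The hardware and batch size behind the 415 FPS figure are not given here, so the latency is a back-of-the-envelope conversion rather than a measurement.

```python
def latency_ms(fps: float) -> float:
    """Average time per image in milliseconds for a given throughput."""
    return 1000.0 / fps


def param_memory_mb(n_params: float, bytes_per_param: int = 4) -> float:
    """Approximate weight storage in megabytes (10^6 bytes).

    bytes_per_param=4 assumes float32 weights; this is an assumption,
    not something stated in the summary.
    """
    return n_params * bytes_per_param / 1e6


if __name__ == "__main__":
    # Figures taken from the Findings list above.
    print(f"latency: {latency_ms(415):.2f} ms per image")
    print(f"weights: {param_memory_mb(5.2e6):.1f} MB at float32")
```

Under these assumptions, 415 FPS corresponds to roughly 2.41 ms per image, and 5.2 million parameters occupy about 20.8 MB.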
Abstract
Current RGB-D methods usually leverage large-scale backbones to improve accuracy but sacrifice efficiency. Meanwhile, existing lightweight methods struggle to achieve high accuracy. To balance efficiency and performance, we propose a Speed-Accuracy Tradeoff Network (SATNet) for lightweight RGB-D SOD from three fundamental perspectives: depth quality, modality fusion, and feature representation. Concerning depth quality, we introduce the Depth Anything Model to generate high-quality depth maps, which effectively alleviates the multi-modal gaps in current datasets. For modality fusion, we propose a Decoupled Attention Module (DAM) to explore the consistency within and between modalities. Here, the multi-modal features are decoupled into dual-view feature vectors to project discriminable information of feature maps. For feature representation, we develop a…
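The abstract does not say how the "dual-view feature vectors" are computed, so the following is only an illustrative sketch of the general idea, not the paper's Decoupled Attention Module. It assumes the two views are row-wise and column-wise average pooling of each feature map, and that cross-modal consistency is scored by an element-wise product followed by a softmax. All function names (`dual_view_vectors`, `decoupled_fusion`) and the fusion rule are hypothetical.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def dual_view_vectors(feat: np.ndarray):
    """Split a (C, H, W) feature map into two directional views.

    ASSUMPTION: the views are average pools along width and height;
    the source does not specify the actual decoupling.
    """
    row_view = feat.mean(axis=2)  # (C, H): one value per row
    col_view = feat.mean(axis=1)  # (C, W): one value per column
    return row_view, col_view


def decoupled_fusion(rgb: np.ndarray, depth: np.ndarray):
    """Hypothetical fusion of RGB and depth features of shape (C, H, W)."""
    rgb_h, rgb_w = dual_view_vectors(rgb)
    dep_h, dep_w = dual_view_vectors(depth)

    # Score agreement between modalities per view, then normalize.
    attn_h = softmax(rgb_h * dep_h, axis=1)  # (C, H)
    attn_w = softmax(rgb_w * dep_w, axis=1)  # (C, W)

    # Outer product rebuilds a (C, H, W) attention map per channel.
    attn = attn_h[:, :, None] * attn_w[:, None, :]

    # Reweight the summed modalities with the shared attention map.
    fused = (rgb + depth) * attn
    return fused, attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.standard_normal((8, 16, 16))
    depth = rng.standard_normal((8, 16, 16))
    fused, attn = decoupled_fusion(rgb, depth)
    print(fused.shape, attn.shape)
```

Working on two short vectors per channel instead of a full H×W attention matrix keeps the cost linear in H + W, which is one plausible reason a lightweight design would favor this kind of decoupling.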
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
Methods: Softmax · Attention Is All You Need
