STENet: Superpixel Token Enhancing Network for RGB-D Salient Object Detection
Jianlin Chen, Gongyang Li, Zhijiang Zhang, Liang Chang, Dan Zeng

TL;DR
STENet introduces superpixel-based modules into a transformer framework for RGB-D salient object detection, improving global and local feature extraction while reducing computational complexity.
Contribution
The paper proposes a novel superpixel token approach with enhanced global and local modules, addressing attention complexity and local detail extraction in RGB-D SOD.
Findings
Achieves competitive results on seven RGB-D SOD datasets.
Reduces attention complexity by modeling region-level relationships.
Effectively captures local details through superpixel-based refinement.
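The complexity point above can be illustrated with a minimal NumPy sketch. This is not the STENet implementation: a fixed square grid stands in for a real superpixel map, the attention is single-head with no learned projections, and every name here (grid_labels, pool_to_tokens, region_attention) is hypothetical. It only shows why attending over K region tokens needs a K x K score matrix rather than an N x N one over N pixels.

```python
import numpy as np

def grid_labels(h, w, cell):
    """Assign each pixel to a square grid cell (an assumed stand-in for superpixels)."""
    rows = np.arange(h) // cell
    cols = np.arange(w) // cell
    n_cols = -(-w // cell)  # ceiling division: number of cells per row
    return (rows[:, None] * n_cols + cols[None, :]).reshape(-1)

def pool_to_tokens(feats, labels, k):
    """Average the (N, d) pixel features inside each region to get (k, d) tokens."""
    sums = np.zeros((k, feats.shape[1]))
    np.add.at(sums, labels, feats)  # unbuffered scatter-add per region
    counts = np.bincount(labels, minlength=k).astype(float)
    return sums / np.maximum(counts, 1.0)[:, None]

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def region_attention(tokens):
    """Plain self-attention over k region tokens; the score matrix is (k, k)."""
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)
    return softmax(scores) @ tokens

h, w, d, cell = 32, 32, 16, 4
rng = np.random.default_rng(0)
feats = rng.standard_normal((h * w, d))  # toy pixel features, N = h * w

labels = grid_labels(h, w, cell)
k = int(labels.max()) + 1
tokens = pool_to_tokens(feats, labels, k)
attended = region_attention(tokens)
refined = feats + attended[labels]  # broadcast region context back to each pixel

pixel_scores = (h * w) ** 2  # entries in a pixel-level attention matrix
region_scores = k ** 2       # entries in the region-level attention matrix
print(k, pixel_scores, region_scores)
```

In this toy setting the score matrix shrinks from N^2 to K^2 entries, while pooling and scattering back cost only on the order of N * d. A real pipeline would replace the grid with a learned or algorithmic superpixel assignment, and the paper's own modules may differ substantially from this sketch.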
Abstract
Transformer-based methods for RGB-D Salient Object Detection (SOD) have gained significant interest, owing to the transformer's exceptional capacity to capture long-range pixel dependencies. Nevertheless, current RGB-D SOD methods face challenges such as the quadratic complexity of the attention mechanism and limited local detail extraction. To overcome these limitations, we propose a novel Superpixel Token Enhancing Network (STENet), which introduces superpixels into cross-modal interaction. STENet follows the two-stream encoder-decoder structure. At its core are two tailored superpixel-driven cross-modal interaction modules, responsible for global and local feature enhancement. Specifically, we update the superpixel generation method by expanding the neighborhood range of each superpixel, allowing for flexible transformation between pixels and superpixels. With the updated…
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Image and Video Quality Assessment
