Beyond Global Scanning: Adaptive Visual State Space Modeling for Salient Object Detection in Optical Remote Sensing Images
Mengyu Ren, Yutong Li, Hua Li, Chuhong Wang, Runmin Cong

TL;DR
This paper introduces ASCNet, an adaptive visual state space model that effectively captures multi-scale features and local-global dependencies for improved salient object detection in optical remote sensing images.
Contribution
The paper proposes ASCNet, an adaptive state space network whose multi-level context and patchwise modules enhance feature integration and local modeling for salient object detection in optical remote sensing images.
Findings
Achieves state-of-the-art performance on benchmark datasets.
Effectively captures multi-scale and local features.
Outperforms existing methods in accuracy and robustness.
Abstract
Salient object detection (SOD) in optical remote sensing images (ORSIs) faces numerous challenges, including large variations in target scale and low contrast between targets and the background. Existing methods based on vision transformer (ViT) and convolutional neural network (CNN) architectures aim to leverage both global and local features, but the difficulty of integrating these heterogeneous features effectively limits their overall performance. To overcome these limitations, we propose an adaptive state space context network (ASCNet), which builds upon the state space model mechanism to simultaneously capture long-range dependencies and enhance regional feature representation. Specifically, we employ the visual state space encoder to extract multi-scale features. To further achieve deep guidance and enhancement of these features, we design a Multi-Level Context Module…
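For readers unfamiliar with the state space model mechanism the abstract refers to, the following is a minimal sketch of a generic discrete linear state space recurrence applied to a flattened image. It is not ASCNet's encoder or its adaptive scanning: the function names `ssm_scan` and `flatten_row_major`, the scalar parameters `a`, `b`, `c`, and the fixed row-major scan order are all assumptions chosen for illustration. It only shows how a recurrent hidden state can carry information from an early pixel to later positions in the sequence, which is the basis of long-range dependency modeling in such models.

```python
def ssm_scan(xs, a, b, c):
    """Scalar linear state space recurrence over a 1D sequence (illustrative only).

    h_t = a * h_{t-1} + b * x_t   (state update)
    y_t = c * h_t                 (readout)
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # the state accumulates a decayed history of inputs
        ys.append(c * h)    # the output at each step depends on all earlier inputs
    return ys


def flatten_row_major(image):
    """Flatten a 2D list of pixel values into one sequence, row by row.

    This fixed global scan order is an assumption for the sketch; it is not
    the adaptive scheme proposed in the paper.
    """
    return [pixel for row in image for pixel in row]


# A 2x2 toy "image" with a single bright pixel in the top-left corner.
image = [[1.0, 0.0],
         [0.0, 0.0]]

outputs = ssm_scan(flatten_row_major(image), a=0.5, b=1.0, c=1.0)
print(outputs)  # [1.0, 0.5, 0.25, 0.125]
```

The bright first pixel still influences the output at the last position, with a contribution that decays by the factor `a` at each step. In a fixed scan like this one, how far that influence reaches depends on where a pixel falls in the flattened order rather than on its spatial neighborhood.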
