Beyond Global Scanning: Adaptive Visual State Space Modeling for Salient Object Detection in Optical Remote Sensing Images

Mengyu Ren; Yutong Li; Hua Li; Chuhong Wang; Runmin Cong

arXiv:2508.10542 · cs.CV · February 5, 2026

TL;DR

This paper introduces ASCNet, an adaptive visual state space model that effectively captures multi-scale features and local-global dependencies for improved salient object detection in optical remote sensing images.

Contribution

The paper proposes a novel adaptive state space network with multi-level context and patchwise modules, enhancing feature integration and local modeling for remote sensing SOD.

Findings

01. Achieves state-of-the-art performance on benchmark datasets.

02. Effectively captures multi-scale and local features.

03. Outperforms existing methods in accuracy and robustness.

Abstract

Salient object detection (SOD) in optical remote sensing images (ORSIs) faces numerous challenges, including significant variations in target scale and low contrast between targets and the background. Existing methods based on vision transformer (ViT) and convolutional neural network (CNN) architectures aim to leverage both global and local features, but the difficulty of effectively integrating these heterogeneous features limits their overall performance. To overcome these limitations, we propose an adaptive state space context network (ASCNet), which builds upon the state space model mechanism to simultaneously capture long-range dependencies and enhance regional feature representation. Specifically, we employ the visual state space encoder to extract multi-scale features. To further achieve deep guidance and enhancement of these features, we design a Multi-Level Context Module…
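To illustrate the state space mechanism the abstract refers to, here is a minimal sketch of a linear state space recurrence run over a flattened image. It is not the authors' implementation. The function names (`ssm_scan`, `flatten_row_major`, `flatten_patchwise`), the diagonal parameterization, and the fixed parameters `a`, `b`, `c` are all assumptions chosen for illustration. The sketch contrasts a global row-major scan with a patch-by-patch ordering, which keeps spatial neighbours closer together in the sequence; that is one plausible reading of "patchwise" local modeling, not a description of ASCNet itself.

```python
def ssm_scan(xs, a, b, c):
    """Diagonal linear state space recurrence over a 1-D sequence.

    h_t = a * h_{t-1} + b * x_t   (elementwise over state channels)
    y_t = sum(c * h_t)
    The parameters a, b, c are fixed here; they are illustrative only.
    """
    h = [0.0] * len(a)
    ys = []
    for x in xs:
        h = [ai * hi + bi * x for ai, hi, bi in zip(a, h, b)]
        ys.append(sum(ci * hi for ci, hi in zip(c, h)))
    return ys


def flatten_row_major(image):
    """Global scan order: read the whole image row by row."""
    return [v for row in image for v in row]


def flatten_patchwise(image, patch):
    """Hypothetical local scan order: visit one patch at a time,
    reading each patch row by row before moving to the next patch."""
    height, width = len(image), len(image[0])
    seq = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            for r in range(top, min(top + patch, height)):
                for col in range(left, min(left + patch, width)):
                    seq.append(image[r][col])
    return seq


# Toy 4x4 "image" whose pixel values equal their row-major index.
image = [[r * 4 + col for col in range(4)] for r in range(4)]

global_order = flatten_row_major(image)
patch_order = flatten_patchwise(image, patch=2)

# Same recurrence, two different scan orders.
params = dict(a=[0.5], b=[1.0], c=[1.0])
y_global = ssm_scan(global_order, **params)
y_patch = ssm_scan(patch_order, **params)
```

In the row-major order, the pixel directly below a given pixel sits a full image width later in the sequence, so its influence on the hidden state has decayed by then. In the patchwise order it is at most a few steps away, which is the intuition behind scanning locally before aggregating globally.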
