Semantic Localization Guiding Segment Anything Model For Reference Remote Sensing Image Segmentation

Shuyang Li; Shuang Wang; Zhuangzhuang Sun; Jing Xiao

arXiv:2506.10503·cs.CV·June 13, 2025

Open Access

TL;DR

This paper introduces PSLG-SAM, a two-stage framework for reference remote sensing image segmentation that decomposes the task into coarse localization and fine segmentation, reducing annotation needs and improving accuracy.

Contribution

The paper proposes a novel two-stage framework for Reference Remote Sensing Image Segmentation (RRSIS) that pairs a train-free segmentation stage with a new dataset, improving performance and reducing annotation requirements.

Findings

01 Significant performance improvements over existing models.

02 The second stage is train-free, reducing annotation burden.

03 Effective handling of complex scenes in remote sensing images.

Abstract

The Reference Remote Sensing Image Segmentation (RRSIS) task generates segmentation masks for specified objects in images based on textual descriptions and has attracted widespread research interest. Current RRSIS methods rely on multi-modal fusion backbones and semantic segmentation heads but face challenges such as dense annotation requirements and complex scene interpretation. To address these issues, we propose a framework named prompt-generated semantic localization guiding Segment Anything Model (PSLG-SAM), which decomposes the RRSIS task into two stages: coarse localization and fine segmentation. In the coarse localization stage, a visual grounding network roughly locates the text-described object. In the fine segmentation stage, the coordinates from the first stage guide the Segment Anything Model (SAM), enhanced by a clustering-based foreground point generator and…
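The abstract does not say how the clustering-based foreground point generator works, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a grayscale image stored as nested lists, a box from the coarse stage given as `(x0, y0, x1, y1)` with exclusive upper bounds, plain 1-D k-means on pixel intensity, and the heuristic that the described object covers the box center. The names `kmeans_1d` and `foreground_points` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); x1 and y1 are exclusive


def kmeans_1d(values: List[float], k: int = 2, iters: int = 20) -> List[int]:
    """Plain 1-D k-means; returns one cluster index per input value."""
    lo, hi = min(values), max(values)
    # Deterministic initialization: spread centers evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


def foreground_points(
    image: Sequence[Sequence[float]], box: Box, k: int = 2, n_points: int = 3
) -> List[Tuple[int, int]]:
    """Pick candidate foreground (x, y) points inside a coarse box.

    Assumption (not from the paper): the cluster that contains the pixel
    nearest the box center is treated as foreground.
    """
    x0, y0, x1, y1 = box
    coords = [(x, y) for y in range(y0, y1) for x in range(x0, x1)]
    values = [float(image[y][x]) for x, y in coords]
    labels = kmeans_1d(values, k)

    cx, cy = (x0 + x1 - 1) / 2, (y0 + y1 - 1) / 2

    def dist2(p: Tuple[int, int]) -> float:
        return (p[0] - cx) ** 2 + (p[1] - cy) ** 2

    center_idx = min(range(len(coords)), key=lambda i: dist2(coords[i]))
    fg_label = labels[center_idx]
    fg_coords = [p for p, lab in zip(coords, labels) if lab == fg_label]
    # Return the foreground pixels closest to the box center.
    return sorted(fg_coords, key=dist2)[:n_points]


# Toy example: an 8x8 dark image with a bright 4x4 object.
image = [[0.0] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in range(2, 6):
        image[y][x] = 10.0

points = foreground_points(image, box=(1, 1, 7, 7))
```

In a pipeline of this shape, the returned points would be passed, together with the box, as positive prompts to a promptable segmenter such as SAM; that call is omitted here because it depends on the model and checkpoint in use.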

Taxonomy

Topics: Advanced Neural Network Applications · Remote-Sensing Image Classification · Automated Road and Building Extraction

Methods: Softmax · Attention Is All You Need