Prompting DirectSAM for Semantic Contour Extraction in Remote Sensing Images
Shiyu Miao, Delong Chen, Fan Liu, Chuanyi Zhang, Yanhui Gu, Shengjie Guo, Jun Zhou

TL;DR
This paper introduces DirectSAM-RS, a foundation model for semantic contour extraction in remote sensing images, leveraging a large-scale dataset and a prompt-based architecture to achieve state-of-the-art results.
Contribution
It develops a large-scale remote sensing dataset and adapts DirectSAM with a prompt module for improved semantic contour extraction.
Findings
DirectSAM-RS achieves state-of-the-art performance on remote sensing semantic contour extraction benchmarks.
The model is effective in both zero-shot and fine-tuning settings.
Training benefits from the large-scale dataset of over 34k image-text-contour triplets.
Abstract
The Direct Segment Anything Model (DirectSAM) excels in class-agnostic contour extraction. In this paper, we explore its use by applying it to optical remote sensing imagery, where semantic contour extraction (such as identifying buildings, road networks, and coastlines) holds significant practical value. These applications are currently handled by training specialized small models separately on small datasets in each domain. We introduce a foundation model derived from DirectSAM, termed DirectSAM-RS, which not only inherits the strong segmentation capability acquired from natural images, but also benefits from a large-scale dataset we created for remote sensing semantic contour extraction. This dataset comprises over 34k image-text-contour triplets, making it at least 30 times larger than any individual dataset. DirectSAM-RS integrates a prompter module: a text encoder and cross-attention…
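The abstract is cut off just after naming the prompter's two ingredients, a text encoder and cross-attention, so the actual design is not given here. As a rough illustration of what cross-attention between image features and a text prompt generally looks like, the following pure-Python sketch may help. It is not the authors' implementation: the function names, the choice of image tokens as queries and text tokens as keys and values, the absence of learned projections, and the residual addition are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(image_tokens, text_tokens):
    """Single-head cross-attention with no learned projections (assumption).

    image_tokens: list of d-dimensional vectors, used as queries (assumption).
    text_tokens:  list of d-dimensional vectors, used as keys and values.
    Returns image tokens with the attended text features added residually.
    """
    dim = len(text_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in image_tokens:
        # Scaled dot-product similarity between this image token and each text token.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in text_tokens]
        weights = softmax(scores)
        # Weighted sum of the text tokens (values) for this image token.
        attended = [
            sum(w * value[j] for w, value in zip(weights, text_tokens))
            for j in range(dim)
        ]
        # Residual connection (assumption): keep the image feature, add text context.
        fused.append([q + a for q, a in zip(query, attended)])
    return fused


if __name__ == "__main__":
    # Hypothetical toy inputs: two image tokens, two text tokens, dimension 2.
    image = [[1.0, 0.0], [0.0, 1.0]]
    text = [[1.0, 0.0], [0.0, 1.0]]
    for row in cross_attention(image, text):
        print(row)
```

In a real model the queries, keys, and values would pass through learned linear projections and the text tokens would come from a pretrained text encoder; the sketch only shows how a text prompt can modulate per-location image features.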
Taxonomy
Topics: Remote-Sensing Image Classification · Satellite Image Processing and Photogrammetry · Advanced Image and Video Retrieval Techniques
