ViRefSAM: Visual Reference-Guided Segment Anything Model for Remote Sensing Segmentation

Hanbo Bi; Yulong Xu; Ya Li; Yongqiang Mao; Boyuan Tong; Chongyang Li; Chunbo Lang; Wenhui Diao; Hongqi Wang; Yingchao Feng; Xian Sun

arXiv:2507.02294 · cs.CV · July 4, 2025

TL;DR

ViRefSAM enhances the Segment Anything Model for remote sensing by enabling automatic, few-shot, class-specific segmentation without manual prompts, addressing domain adaptation and efficiency issues.

Contribution

Introduces ViRefSAM, a novel framework with a visual prompt encoder and dynamic target adapter, improving SAM's performance on remote sensing segmentation with minimal reference images.

Findings

01. Outperforms existing few-shot segmentation methods on multiple benchmarks.
02. Enables accurate segmentation of unseen classes with few reference images.
03. Effectively bridges the domain gap for remote sensing images.

Abstract

The Segment Anything Model (SAM), with its prompt-driven paradigm, exhibits strong generalization in generic segmentation tasks. However, applying SAM to remote sensing (RS) images still faces two major challenges. First, manually constructing precise prompts for each image (e.g., points or boxes) is labor-intensive and inefficient, especially in RS scenarios with dense small objects or spatially fragmented distributions. Second, SAM lacks domain adaptability, as it is pre-trained primarily on natural images and struggles to capture RS-specific semantics and spatial characteristics, especially when segmenting novel or unseen classes. To address these issues, inspired by few-shot learning, we propose ViRefSAM, a novel framework that guides SAM utilizing only a few annotated reference images that contain class-specific objects. Without requiring manual prompts, ViRefSAM enables automatic…
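
The abstract describes guiding segmentation with a few annotated reference images instead of manual prompts. As a rough illustration of that general idea only, the sketch below shows a common few-shot segmentation baseline: masked average pooling builds a class prototype from reference features, and cosine similarity against query features gives a coarse localization map. This is not the ViRefSAM architecture (its visual prompt encoder and dynamic target adapter are not detailed on this page). All function names are hypothetical, and the feature arrays are assumed to stand in for the output of some image encoder.

```python
import numpy as np


def masked_average_prototype(ref_features, ref_mask):
    """Average the feature vectors that fall inside the reference mask.

    ref_features: array of shape (C, H, W), assumed to come from an image encoder.
    ref_mask: binary array of shape (H, W) marking the class-specific object.
    Returns a prototype vector of shape (C,).
    """
    mask = ref_mask.astype(ref_features.dtype)
    area = mask.sum()
    if area == 0:
        raise ValueError("reference mask is empty")
    return (ref_features * mask[None]).sum(axis=(1, 2)) / area


def few_shot_prototype(ref_features_list, ref_masks_list):
    """Average the per-image prototypes over the few reference images."""
    protos = [
        masked_average_prototype(f, m)
        for f, m in zip(ref_features_list, ref_masks_list)
    ]
    return np.mean(protos, axis=0)


def cosine_similarity_map(query_features, prototype, eps=1e-8):
    """Cosine similarity between the prototype and every query location.

    query_features: array of shape (C, H, W).
    Returns an (H, W) map; high values suggest the reference class is present.
    """
    c, h, w = query_features.shape
    flat = query_features.reshape(c, -1)
    num = prototype @ flat
    denom = np.linalg.norm(prototype) * np.linalg.norm(flat, axis=0) + eps
    return (num / denom).reshape(h, w)
```

In a prompt-free pipeline, a map like this could in principle be thresholded or converted into a prompt for a segmentation model; how ViRefSAM actually derives its prompts from reference images is described in the paper itself, not here.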

Taxonomy

Methods: Adapter · Segment Anything Model · Focus