Towards Realistic Open-Vocabulary Remote Sensing Segmentation: Benchmark and Baseline

Bingyu Li, Tao Huo, Haocheng Dong, Da Zhang, Zhiyuan Zhao, Junyu Gao, and Xuelong Li

arXiv:2604.15652 · cs.CV · April 20, 2026

TL;DR

This paper introduces OVRSISBenchV2, a comprehensive benchmark for open-vocabulary remote sensing segmentation, along with Pi-Seg, a new baseline method that enhances transferability through semantic-guided perturbations.

Contribution

The paper presents a large-scale, application-oriented benchmark for OVRSIS and proposes Pi-Seg, a novel baseline that improves generalization with positive-incentive noise.

Findings

1. Pi-Seg outperforms existing methods on challenging benchmarks.

2. OVRSISBenchV2 contains 170K images and 128 categories, substantially expanding scene diversity, semantic coverage, and evaluation difficulty.

3. Perturbation-based training enhances transferability in remote sensing segmentation.

Abstract

Open-vocabulary remote sensing image segmentation (OVRSIS) remains underexplored due to fragmented datasets, limited training diversity, and the lack of evaluation benchmarks that reflect realistic geospatial application demands. Our previous OVRSISBenchV1 established an initial cross-dataset evaluation protocol, but its limited scope is insufficient for assessing realistic open-world generalization. To address this issue, we propose OVRSISBenchV2, a large-scale and application-oriented benchmark for OVRSIS. We first construct OVRSIS95K, a balanced dataset of about 95K image-mask pairs covering 35 common semantic categories across diverse remote sensing scenes. Built upon OVRSIS95K and 10 downstream datasets, OVRSISBenchV2 contains 170K images and 128 categories, substantially expanding scene diversity, semantic coverage, and evaluation difficulty. Beyond…

Code & Models

Repositories

GitHub: LiBingyu01/RSKT-Seg/tree/Pi-Seg
