TL;DR
This paper introduces OVRSISBenchV2, a comprehensive benchmark for open-vocabulary remote sensing segmentation, along with Pi-Seg, a new baseline method that enhances transferability through semantic-guided perturbations.
Contribution
The paper presents a large-scale, application-oriented benchmark for OVRSIS and proposes Pi-Seg, a novel baseline that improves generalization with positive-incentive noise.
Findings
Pi-Seg outperforms existing methods on challenging benchmarks.
OVRSISBenchV2 significantly expands scene diversity and semantic coverage.
Perturbation-based training enhances transferability in remote sensing segmentation.
Abstract
Open-vocabulary remote sensing image segmentation (OVRSIS) remains underexplored due to fragmented datasets, limited training diversity, and the lack of evaluation benchmarks that reflect realistic geospatial application demands. Our previous OVRSISBenchV1 established an initial cross-dataset evaluation protocol, but its limited scope is insufficient for assessing realistic open-world generalization. To address this issue, we propose OVRSISBenchV2, a large-scale and application-oriented benchmark for OVRSIS. We first construct OVRSIS95K, a balanced dataset of about 95K image-mask pairs covering 35 common semantic categories across diverse remote sensing scenes. Built upon OVRSIS95K and 10 downstream datasets, OVRSISBenchV2 contains 170K images and 128 categories, substantially expanding scene diversity, semantic coverage, and evaluation difficulty. Beyond…
