Annotation-Free Open-Vocabulary Segmentation for Remote-Sensing Images

Kaiyu Li; Xiangyong Cao; Ruixun Liu; Shihong Wang; Zixuan Jiang; Zhi Wang; Deyu Meng

arXiv:2508.18067 · cs.CV · August 26, 2025

TL;DR

This paper introduces SegEarth-OV, a framework for annotation-free open-vocabulary segmentation of remote sensing images. It handles scale variations and fine details without manual annotations and extends to SAR images via knowledge distillation.

Contribution

The paper presents SegEarth-OV, the first annotation-free open-vocabulary segmentation framework for RS images, and introduces AlignEarth for cross-modal knowledge transfer to SAR data.

Findings

01. Significant performance improvements over state-of-the-art methods.

02. Effective handling of scale variations and fine details in RS images.

03. Successful extension to SAR images using knowledge distillation.
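To make the knowledge-distillation idea concrete, here is a minimal sketch of a generic feature-level distillation loss. It is not the paper's AlignEarth procedure, which this page does not detail. It assumes a frozen teacher encoder and a student encoder that produce feature vectors for the same scenes, and it uses a cosine-distance loss. The function name `cosine_distillation_loss` and the toy arrays are hypothetical.

```python
import numpy as np

def cosine_distillation_loss(student: np.ndarray, teacher: np.ndarray,
                             eps: float = 1e-8) -> float:
    """Mean (1 - cosine similarity) between matching rows of two (N, D) arrays.

    Assumption: row i of `student` and row i of `teacher` describe the same
    scene, for example a SAR feature and the optical feature it should match.
    """
    s = student / (np.linalg.norm(student, axis=-1, keepdims=True) + eps)
    t = teacher / (np.linalg.norm(teacher, axis=-1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(s * t, axis=-1)))

# Toy teacher features (hypothetical values, two samples of dimension 2).
teacher_feats = np.array([[1.0, 0.0], [0.0, 1.0]])

# A student whose features point in the same directions gives a loss near 0.
aligned = cosine_distillation_loss(teacher_feats * 3.0, teacher_feats)

# A student whose features are orthogonal to the teacher's gives a loss near 1.
misaligned = cosine_distillation_loss(teacher_feats[::-1], teacher_feats)
```

Cosine distance ignores feature magnitude, so the student is only pushed to match the teacher's directions. That is one common choice, and other losses such as mean squared error would fit the same pattern.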

Abstract

Semantic segmentation of remote sensing (RS) images is pivotal for comprehensive Earth observation, but the demand for interpreting new object categories, coupled with the high expense of manual annotation, poses significant challenges. Although open-vocabulary semantic segmentation (OVSS) offers a promising solution, existing frameworks designed for natural images are insufficient for the unique complexities of RS data. They struggle with vast scale variations and fine-grained details, and their adaptation often relies on extensive, costly annotations. To address this critical gap, this paper introduces SegEarth-OV, the first framework for annotation-free open-vocabulary segmentation of RS images. Specifically, we propose SimFeatUp, a universal upsampler that robustly restores high-resolution spatial details from coarse features, correcting distorted target shapes without any…
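As a rough illustration of the pipeline the abstract describes, the sketch below upsamples a coarse feature map and labels each pixel with the class whose text embedding is most similar. It rests on several assumptions: nearest-neighbour upsampling stands in for a learned upsampler such as SimFeatUp, whose design is not given here, and the function names, feature values, and class embeddings are all hypothetical.

```python
import numpy as np

def upsample_nearest(features: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling of an (H, W, D) feature map.

    Placeholder only: a learned upsampler would recover finer detail.
    """
    return features.repeat(factor, axis=0).repeat(factor, axis=1)

def l2_normalize(x: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Scale vectors along the last axis to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def open_vocab_segment(features: np.ndarray, text_embeddings: np.ndarray,
                       factor: int) -> np.ndarray:
    """Assign each upsampled pixel the index of the most similar class embedding."""
    dense = l2_normalize(upsample_nearest(features, factor))  # (H*f, W*f, D)
    text = l2_normalize(text_embeddings)                      # (K, D)
    logits = dense @ text.T                                   # cosine similarities
    return logits.argmax(axis=-1)                             # (H*f, W*f)

# Toy 2x2 coarse feature map with 3 channels (hypothetical values).
features = np.zeros((2, 2, 3))
features[0, :, 0] = 1.0  # top row points along channel 0
features[1, :, 1] = 1.0  # bottom row points along channel 1

# Two hypothetical class embeddings, e.g. "building" and "water".
text_embeddings = np.eye(3)[:2]

mask = open_vocab_segment(features, text_embeddings, factor=2)
```

Because the class list enters only through `text_embeddings`, adding a category means adding a row there, with no retraining of the image side in this sketch.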
