SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images

Kaiyu Li; Ruixun Liu; Xiangyong Cao; Xueru Bai; Feng Zhou; Deyu Meng; Zhi Wang

arXiv:2410.01768 · cs.CV · November 5, 2024

TL;DR

This paper introduces SegEarth-OV, a training-free open-vocabulary segmentation method for remote sensing images. It combines a feature upsampler (SimFeatUp) with a subtraction step that corrects bias in CLIP patch tokens, and reports improvements over prior methods across multiple datasets.

Contribution

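Proposes SimFeatUp, an upsampler that restores spatial detail lost in low-resolution deep features, and a subtraction operation on CLIP patch tokens motivated by their abnormal response to the [CLS] token. Both are applied in a training-free style to reduce distorted target shapes and ill-fitting mask boundaries in open-vocabulary segmentation of remote sensing images.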

Findings

01. Achieves up to 15.3% improvement over state-of-the-art methods.

02. Effective across diverse remote sensing tasks and datasets.

03. Introduces training-free techniques suitable for practical applications.

Abstract

Remote sensing images play an irreplaceable role in fields such as agriculture, water resources, military, and disaster relief. Pixel-level interpretation is a critical aspect of remote sensing image applications; however, a prevalent limitation remains the need for extensive manual annotation. To address this, we introduce open-vocabulary semantic segmentation (OVSS) into the remote sensing context. However, because remote sensing images are sensitive to low-resolution features, the predicted masks exhibit distorted target shapes and ill-fitting boundaries. To tackle this issue, we propose a simple and general upsampler, SimFeatUp, to restore lost spatial information in deep features in a training-free style. Further, based on the observation of the abnormal response of local patch tokens to the [CLS] token in CLIP, we propose to execute a straightforward subtraction operation…
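The subtraction idea mentioned in the abstract can be illustrated with a small NumPy sketch. The abstract is truncated here and does not give the exact formulation, so everything below is an assumption made for illustration: the function names `subtract_global_bias` and `classify_patches`, the weight `lam`, and the cosine-similarity classification step are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def subtract_global_bias(patch_tokens, cls_token, lam=1.0):
    """Subtract a scaled global [CLS] embedding from every patch embedding.

    patch_tokens: (N, D) array of local patch embeddings.
    cls_token:    (D,) array, the global [CLS] embedding.
    lam:          hypothetical weight (an assumption, not a value from the paper).
    """
    return patch_tokens - lam * cls_token[None, :]


def classify_patches(patch_tokens, cls_token, text_embeds, grid_hw, lam=1.0):
    """Assign each patch to the class whose text embedding is most similar.

    text_embeds: (C, D) array, one embedding per class name.
    grid_hw:     (H, W) patch grid shape with H * W == N.
    """
    debiased = subtract_global_bias(patch_tokens, cls_token, lam)
    # Cosine similarity: L2-normalise both sides, then take dot products.
    p = debiased / np.linalg.norm(debiased, axis=-1, keepdims=True)
    t = text_embeds / np.linalg.norm(text_embeds, axis=-1, keepdims=True)
    logits = p @ t.T  # shape (N, C)
    return logits.argmax(axis=-1).reshape(grid_hw)


if __name__ == "__main__":
    # Toy data: 2 classes, 4-dimensional embeddings, a 2 x 3 patch grid.
    text = np.eye(2, 4)
    bias = np.array([0.0, 0.0, 5.0, 0.0])  # shared component standing in for [CLS]
    patches = np.vstack([np.tile(text[0], (3, 1)), np.tile(text[1], (3, 1))]) + bias
    print(classify_patches(patches, bias, text, (2, 3)))
```

In this toy setup every patch carries the same additive component, and subtracting it leaves only the part of each embedding that distinguishes one class from another. A real pipeline would take the patch and [CLS] embeddings from a CLIP image encoder and the class embeddings from its text encoder.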

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques

Methods: Contrastive Language-Image Pre-training