Exploring Text-Guided Single Image Editing for Remote Sensing Images

Fangzhou Han; Lingyu Si; Zhizhuo Jiang; Hongwei Dong; Lamei Zhang; Yu Liu; Hao Chen; Bo Du

arXiv:2405.05769·cs.CV·July 2, 2025

TL;DR

This paper introduces a novel text-guided single-image editing method for remote sensing images that operates with minimal training data, leveraging pre-trained vision-language models and prompt ensembling to improve accuracy and controllability.

Contribution

It proposes a new RSI editing approach using only one image for training, employing multi-scale training and prompt ensembling to overcome dataset limitations and semantic ambiguity.
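Prompt ensembling is not detailed on this page, so the following is only a minimal sketch of one common form of it: embed several phrasings of the same edit instruction, average the unit-length embeddings, and renormalize, so that no single ambiguous phrasing dominates the guidance. The encoder `toy_text_embedding`, the helper names, and the example prompts are all hypothetical stand-ins introduced for illustration; a real pipeline would use a pre-trained vision-language model's text encoder, and the paper's actual procedure may differ.

```python
import hashlib
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def toy_text_embedding(prompt, dim=8):
    """Hypothetical stand-in for a VLM text encoder (NOT a real model).

    Maps a prompt to a deterministic unit vector derived from its hash,
    so the example runs without any model weights.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    # Map the first `dim` bytes (0..255) to values in [-1, 1].
    raw = [(b / 127.5) - 1.0 for b in digest[:dim]]
    return l2_normalize(raw)


def ensemble_prompts(prompts, embed=toy_text_embedding):
    """Average the unit embeddings of several prompts, then renormalize."""
    if not prompts:
        raise ValueError("at least one prompt is required")
    embeddings = [embed(p) for p in prompts]
    dim = len(embeddings[0])
    mean = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    return l2_normalize(mean)


def cosine_similarity(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical phrasings of one edit instruction.
prompts = [
    "a satellite image of a flooded village",
    "an aerial photo of a flooded village",
    "a remote sensing image of a flooded village",
]
target = ensemble_prompts(prompts)

# Compare each individual phrasing with the ensembled direction.
for p in prompts:
    print(p, round(cosine_similarity(toy_text_embedding(p), target), 3))
```

With a real encoder, the ensembled vector would replace the single-prompt embedding wherever text guidance is computed, and the same cosine similarity against an image embedding is the quantity that CLIP-score comparisons are built on.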

Findings

01 Outperforms existing methods in CLIP scores

02 Achieves high subjective quality in edits

03 Supports practical disaster assessment tasks

Abstract

Artificial intelligence generative content (AIGC) has significantly impacted image generation in the field of remote sensing. However, the equally important area of remote sensing image (RSI) editing has not received sufficient attention. Deep-learning-based editing methods generally involve two sequential stages: generation and editing. For natural images, these stages primarily rely on generative backbones pre-trained on large-scale benchmark datasets and on text guidance facilitated by vision-language models (VLMs). However, this approach becomes less viable for RSIs. First, existing generative RSI benchmark datasets do not fully capture the diversity of RSIs and are often inadequate for universal editing tasks. Second, a single text semantic can correspond to multiple image semantics, which leads to the introduction of incorrect semantics. To solve the above problems, this paper proposes a text-guided RSI…

Code & Models

Repositories

hit-philiphan/remote_sensing_image_editing (official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Geological Modeling and Analysis

Methods: Diffusion · Contrastive Language-Image Pre-training