RSEdit: Text-Guided Image Editing for Remote Sensing

Chen Zhenyuan; Zhang Zechuan; Zhang Feng

arXiv:2603.13708·cs.CV·May 19, 2026

TL;DR

This paper introduces RSEdit, a collection of text-guided image editing models for remote sensing, spanning U-Net to DiT architectures. In the authors' experiments, it produces the most instruction-faithful edits while preserving geospatial structure.

Contribution

The paper presents the first comprehensive study of conditioning strategies for building remote sensing image editing models from off-the-shelf text-to-image models.

Findings

01

In the authors' experiments, RSEdit produces the most instruction-faithful edits.

02

It preserves geospatial structure effectively.

03

The code and checkpoints are publicly released.

Abstract

In this paper, we explore text-guided image editing in the remote sensing domain using generative modeling. We propose RSEdit, a collection of models from U-Net to DiT with various configurations. Specifically, we present the first comprehensive study of conditioning strategies for building image editing models from off-the-shelf text-to-image ones. Our experiments show that RSEdit achieves the best instruction-faithful edits while preserving geospatial structure. We release the code at https://github.com/Bili-Sakura/RSEdit-Preview and checkpoints at https://huggingface.co/collections/BiliSakura/rsedit.

Code & Models

Repositories

Bili-Sakura/RSEdit-Preview
github

Models

Datasets

BiliSakura/RSEdit-Benchmark-Results
dataset · 279 downloads

Taxonomy

Topics: Geographic Information Systems Studies · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques