RSGen: Enhancing Layout-Driven Remote Sensing Image Generation with Diverse Edge Guidance
Xianbao Hou, Yonghao He, Zeyd Boukhers, John See, Hu Su, Wei Sui, Cong Yang

TL;DR
RSGen is a plug-and-play framework that adds diverse edge guidance to layout-driven remote sensing image generation. It tightens adherence to the bounding box layout, increases the diversity of generated images, and improves performance on downstream detection tasks.
Contribution
RSGen introduces a plug-and-play method that uses diverse edge maps to improve layout-driven remote sensing image synthesis with strict bounding box adherence.
Findings
Significant improvements in detection metrics on the DOTA dataset.
Greater diversity and tighter layout control in generated images.
Baseline L2I models perform better when combined with RSGen.
Abstract
Diffusion models have significantly mitigated the impact of annotated data scarcity in remote sensing (RS). Although recent approaches have successfully harnessed these models to enable diverse and controllable Layout-to-Image (L2I) synthesis, they still suffer from limited fine-grained control and fail to strictly adhere to bounding box constraints. To address these limitations, we propose RSGen, a plug-and-play framework that leverages diverse edge guidance to enhance layout-driven RS image generation. Specifically, RSGen employs a progressive enhancement strategy: 1) it first enriches the diversity of edge maps composited from retrieved training instances via Image-to-Image generation; and 2) subsequently utilizes these diverse edge maps as conditioning for existing L2I models to enforce pixel-level control within bounding boxes, ensuring the generated instances strictly adhere to…
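The first stage described in the abstract, compositing edge maps from retrieved training instances into a layout, can be illustrated with a minimal sketch. This is not the authors' implementation. The names `composite_edge_map`, `resize_nearest`, and `edge_bank` are hypothetical, retrieval is replaced by a random pick per class, boxes are assumed to be axis-aligned pixel rectangles, and the Image-to-Image diversification and L2I conditioning steps are omitted.

```python
import numpy as np


def resize_nearest(patch, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array (stand-in for a real image resize)."""
    in_h, in_w = patch.shape
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return patch[rows[:, None], cols[None, :]]


def composite_edge_map(canvas_hw, boxes, edge_bank, rng):
    """Paste one edge patch per bounding box into a blank canvas.

    boxes: list of (class_name, x0, y0, x1, y1), axis-aligned pixel coordinates
           (an assumption made for this sketch).
    edge_bank: dict mapping class_name -> list of binary 2-D edge patches
               (hypothetical stand-in for edges of retrieved training instances).
    """
    canvas = np.zeros(canvas_hw, dtype=np.uint8)
    for cls, x0, y0, x1, y1 in boxes:
        candidates = edge_bank[cls]
        # Assumption: a random pick replaces whatever retrieval the paper uses.
        patch = candidates[rng.integers(len(candidates))]
        resized = resize_nearest(patch, y1 - y0, x1 - x0)
        # Union with existing edges so overlapping boxes do not erase each other.
        region = canvas[y0:y1, x0:x1]
        canvas[y0:y1, x0:x1] = np.maximum(region, resized)
    return canvas


# Usage: one square-outline patch for a hypothetical "plane" class.
outline = np.zeros((4, 4), dtype=np.uint8)
outline[0, :] = outline[-1, :] = 1
outline[:, 0] = outline[:, -1] = 1
edge_map = composite_edge_map(
    (16, 16), [("plane", 2, 3, 10, 9)], {"plane": [outline]}, np.random.default_rng(0)
)
```

Because each patch is resized to its box and pasted only inside it, every edge pixel in the composite lies within some bounding box. That is the property a downstream conditioning step would rely on to keep instances inside their boxes.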
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis
