TL;DR
UAVGen is a diffusion-based layout-to-image generation framework that improves UAV-based object detection by synthesizing high-fidelity, focused training images, raising detection accuracy in challenging scenarios.
Contribution
The paper presents UAVGen, a layout-to-image generation framework that combines a Visual Prototype Conditioned Diffusion Model (VPC-DM) with a Focal Region Enhanced Data Pipeline (FRE-DP) to improve UAV-based object detection.
Findings
Outperforms state-of-the-art image synthesis methods in UAV detection tasks.
Enhances detection accuracy across various detectors when integrated.
Produces high-fidelity, focused synthetic images that improve model training.
Abstract
Unmanned aerial vehicle (UAV) based object detection is a critical but challenging task when applied in dynamically changing scenarios with limited annotated training data. Layout-to-image generation approaches have proved effective in promoting detection accuracy by synthesizing labeled images based on diffusion models. However, they frequently produce artifacts, especially near the layout boundaries of tiny objects, which substantially limits their performance. To address these issues, we propose UAVGen, a novel layout-to-image generation framework tailored for UAV-based object detection. Specifically, UAVGen designs a Visual Prototype Conditioned Diffusion Model (VPC-DM) that constructs representative instances for each class and integrates them into latent embeddings for high-fidelity object generation. Moreover, a Focal Region Enhanced Data Pipeline (FRE-DP) is…
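The abstract says VPC-DM builds representative instances per class and integrates them into latent embeddings, but it does not say how. As a rough illustration only, the sketch below assumes the simplest possible reading: a class prototype is the mean of that class's instance feature vectors, and it is fused with a layout box embedding by element-wise addition. The function names `build_class_prototypes` and `condition_layout_embeddings`, the toy feature values, and both design choices are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_class_prototypes(features, labels):
    """Average the feature vectors of each class into one prototype vector.

    Assumption: mean-pooling stands in for whatever prototype
    construction the paper actually uses.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def condition_layout_embeddings(box_embeddings, box_labels, prototypes):
    """Add each box's class prototype to its layout embedding.

    Assumption: element-wise addition is a placeholder for the
    paper's integration step, which the abstract does not describe.
    """
    return [
        [e + p for e, p in zip(emb, prototypes[label])]
        for emb, label in zip(box_embeddings, box_labels)
    ]


# Toy instance features (hypothetical values) with their class labels.
features = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]]
labels = ["car", "car", "pedestrian"]
prototypes = build_class_prototypes(features, labels)

# Toy layout: one embedding per bounding box, plus the box's class.
box_embeddings = [[0.5, 0.5], [1.0, 1.0]]
box_labels = ["car", "pedestrian"]
conditioned = condition_layout_embeddings(box_embeddings, box_labels, prototypes)
```

In a real diffusion pipeline the conditioned embeddings would be passed to the denoising network alongside the layout; that part is omitted here because the source gives no details about it.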
