CIA: Controllable Image Augmentation Framework Based on Stable Diffusion

Mohamed Benkedadra; Dany Rimez; Tiffanie Godelaine; Natarajan Chidambaram; Hamed Razavi Khosroshahi; Horacio Tellez; Matei Mancas; Benoit Macq; Sidi Ahmed Mahmoudi

arXiv:2411.16128·cs.CV·November 26, 2024

TL;DR

CIA is a modular framework that uses Stable Diffusion to generate, filter, and control synthetic images for dataset augmentation, significantly improving object detection performance in data-constrained scenarios.

Contribution

Introduces CIA, a novel pipeline combining image generation, quality filtering, and pattern control to enhance dataset augmentation for computer vision tasks.

Findings

01 CIA-generated images improve object detection accuracy.

02 Detection performance approaches that obtained by doubling the number of real images in the dataset.

03 The framework enables effective augmentation in data-limited scenarios.

Abstract

Computer vision tasks such as object detection and segmentation rely on the availability of extensive, accurately annotated datasets. In this work, we present CIA, a modular pipeline for (1) generating synthetic images for dataset augmentation using Stable Diffusion, (2) filtering out low-quality samples using defined quality metrics, and (3) forcing the presence of specific patterns in generated images using accurate prompting and ControlNet. To show how CIA can be used to search for an optimal augmentation pipeline for training data, we study human object detection in a data-constrained scenario, using YOLOv8n on the COCO and Flickr30k datasets. We recorded a significant improvement using CIA-generated images, approaching the performance obtained when doubling the number of real images in the dataset. Our findings suggest that our modular framework can significantly enhance…
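The filtering and augmentation steps described in the abstract can be illustrated with a minimal sketch. This is not the CIA API: the `Sample` record, the `filter_by_quality` and `build_training_set` functions, and the `quality` score are all hypothetical names introduced here, and the abstract does not say which quality metrics the authors use. The sketch assumes each generated image already carries a scalar score where higher means better.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Sample:
    """One training image, referenced by path (hypothetical record type)."""
    path: str
    synthetic: bool
    quality: float = 1.0  # assumed convention: higher is better


def filter_by_quality(candidates: Sequence[Sample], keep_fraction: float) -> List[Sample]:
    """Keep the best-scoring fraction of generated candidates."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not candidates:
        return []
    ranked = sorted(candidates, key=lambda s: s.quality, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


def build_training_set(
    real: Sequence[Sample],
    synthetic: Sequence[Sample],
    synthetic_ratio: float,
    seed: int = 0,
) -> List[Sample]:
    """Add at most synthetic_ratio * len(real) synthetic samples to the real set."""
    n_synth = min(len(synthetic), int(len(real) * synthetic_ratio))
    rng = random.Random(seed)  # seeded so the selection is reproducible
    mixed = list(real) + rng.sample(list(synthetic), n_synth)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Placeholder file names and scores, for illustration only.
    real = [Sample(f"real_{i}.jpg", synthetic=False) for i in range(4)]
    generated = [Sample(f"gen_{i}.png", synthetic=True, quality=i / 10) for i in range(10)]
    kept = filter_by_quality(generated, keep_fraction=0.5)
    train = build_training_set(real, kept, synthetic_ratio=1.0)
    print(len(kept), len(train))
```

With `synthetic_ratio=1.0` the sketch adds as many synthetic images as there are real ones, which is one way to set up a comparison against a training set with twice as many real images. Ranking by score and keeping a fixed fraction is only one possible filtering policy; a fixed score threshold would work equally well in this structure.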

Code & Models

Repositories

multitel-ai/cia (PyTorch, official)

Taxonomy

Methods: Diffusion