# FLORA: Efficient Synthetic Data Generation for Object Detection in Low-Data Regimes via finetuning Flux LoRA

**Authors:** Alvaro Patricio, Atabak Dehban, Rodrigo Ventura

arXiv: 2508.21712 · 2025-09-01

## TL;DR

FLORA introduces a lightweight, efficient synthetic data generation method for object detection that uses low-rank adaptation fine-tuning of diffusion models, enabling high-quality results with minimal data and computational resources.

## Contribution

The paper presents FLORA, a synthetic data generation pipeline that fine-tunes the Flux 1.1 Dev diffusion model exclusively through Low-Rank Adaptation (LoRA). This lets the pipeline run on a consumer-grade GPU while yielding better object detection performance than the ODGEN baseline.

## Key findings

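- Object detectors trained on 500 FLORA-generated synthetic images outperform detectors trained on 5000 synthetic images from the ODGEN baseline.
- Improvements reach up to 21.3% in mAP@.50:.95 over that baseline, evaluated on seven object detection datasets.
- FLORA uses only 10% of the data (500 vs. 5000 images) and a fraction of the computational cost of the baseline.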
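
The metric mAP@.50:.95 cited in the abstract is usually read in the COCO sense: average precision is computed at ten IoU thresholds from 0.50 to 0.95 in steps of 0.05 and then averaged. The sketch below illustrates that convention only. It assumes the paper follows it, it is not the authors' evaluation code, and the helper names (`iou`, `thresholds_matched`, `mean_over_thresholds`) and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Ten IoU thresholds: 0.50, 0.55, ..., 0.95 (COCO-style convention, assumed here).
IOU_THRESHOLDS = [(50 + 5 * i) / 100 for i in range(10)]


def thresholds_matched(pred_box, gt_box):
    """Thresholds at which a prediction would count as a true positive."""
    value = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if value >= t]


def mean_over_thresholds(ap_per_threshold):
    """Average per-threshold AP values into a single mAP@.50:.95 style score."""
    return sum(ap_per_threshold) / len(ap_per_threshold)


# Hypothetical prediction and ground-truth boxes with an IoU of 0.8.
matched = thresholds_matched((0, 0, 10, 10), (0, 0, 10, 8))
```

A box with IoU 0.8 counts as correct at the looser thresholds but not at 0.85 and above, which is why this metric rewards tight localization more than mAP@.50 does.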

## Abstract

Recent advances in diffusion-based generative models have demonstrated significant potential in augmenting scarce datasets for object detection tasks. Nevertheless, most recent models rely on resource-intensive full fine-tuning of large-scale diffusion models, requiring enterprise-grade GPUs (e.g., NVIDIA V100) and thousands of synthetic images. To address these limitations, we propose Flux LoRA Augmentation (FLORA), a lightweight synthetic data generation pipeline. Our approach uses the Flux 1.1 Dev diffusion model, fine-tuned exclusively through Low-Rank Adaptation (LoRA). This dramatically reduces computational requirements, enabling synthetic dataset generation with a consumer-grade GPU (e.g., NVIDIA RTX 4090). We empirically evaluate our approach on seven diverse object detection datasets. Our results demonstrate that training object detectors with just 500 synthetic images generated by our approach yields superior detection performance compared to models trained on 5000 synthetic images from the ODGEN baseline, achieving improvements of up to 21.3% in mAP@.50:.95. This work demonstrates that it is possible to surpass state-of-the-art performance with far greater efficiency, as FLORA achieves superior results using only 10% of the data and a fraction of the computational cost. This work demonstrates that a quality and efficiency-focused approach is more effective than brute-force generation, making advanced synthetic data creation more practical and accessible for real-world scenarios.
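
To make the efficiency argument concrete, the following is a minimal sketch of the general LoRA idea: a frozen weight matrix W is adapted by adding a scaled low-rank product, W + (alpha / r) · B · A, so only A and B are trained. This is a generic illustration in plain Python, not the FLORA training code. The layer sizes, rank, and `alpha` value are made-up toy numbers, and the function names are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * B @ A, the weight a LoRA-adapted layer applies."""
    scale = alpha / rank
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, rank):
    """Parameters updated by full fine-tuning vs. LoRA for a single layer."""
    return {"full": d_out * d_in, "lora": rank * (d_out + d_in)}


# Toy dimensions chosen for illustration only.
d_out, d_in, rank, alpha = 8, 6, 2, 4
rng = random.Random(0)
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]     # frozen
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]   # trained
b = [[0.0] * rank for _ in range(d_out)]                               # trained, zero init

w_eff = lora_effective_weight(w, a, b, alpha, rank)
counts = trainable_params(d_out, d_in, rank)
```

With B initialized to zero, the adapted layer starts out identical to the frozen one. At realistic layer sizes the gap in trainable parameters between `full` and `lora` is far larger than in this toy case, which is the kind of saving that makes fine-tuning on a consumer-grade GPU plausible.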

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21712/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21712/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/2508.21712/full.md

---
Source: https://tomesphere.com/paper/2508.21712