Few-Shot Synthetic Data Generation with Diffusion Models for Downstream Vision Tasks
Daniil Dushenev, Nazariy Karpov, Daniil Zinovjev, Alexander Gorin, Konstantin Kulikov

TL;DR
This paper introduces a lightweight synthetic data augmentation method that fine-tunes a LoRA adapter for a pretrained diffusion model on as few as 20-50 real images, consistently improving rare-class recognition in visual tasks.
Contribution
The paper demonstrates a scalable approach to rare-class augmentation: a diffusion model is adapted with only a few real images of the rare class, and the generated samples improve performance across two structurally different visual domains.
Findings
Synthetic augmentation improves rare-class recall and F1 compared to training on real data alone.
A moderate synthetic-to-real ratio yields the best performance, with diminishing returns at higher ratios.
The approach is effective in both a medical task (chest X-ray pathology classification) and an industrial task (surface crack detection).
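The findings are reported in terms of rare-class recall and F1. As a reminder of what those metrics measure, here is a minimal Python sketch that computes them for a single class from label lists. The function name `rare_class_metrics` and the convention that the rare class is labeled `1` are assumptions for illustration, not taken from the paper.

```python
def rare_class_metrics(y_true, y_pred, rare_label=1):
    """Precision, recall and F1 for one class (hypothetical helper).

    y_true, y_pred: equal-length sequences of class labels.
    rare_label: label of the rare class (assumed to be 1 here).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == rare_label and p == rare_label)
    fp = sum(1 for t, p in pairs if t != rare_label and p == rare_label)
    fn = sum(1 for t, p in pairs if t == rare_label and p != rare_label)

    # Guard against division by zero when the class is never predicted
    # or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy example: 3 rare positives among 8 real held-out samples.
metrics = rare_class_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 0],
    y_pred=[1, 0, 0, 0, 0, 0, 0, 1],
)
```

Recall counts how many of the real rare-class examples were found, which is why it is the natural headline metric when positives are scarce; F1 balances it against precision so that a model cannot score well simply by over-predicting the rare class.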
Abstract
Class imbalance is a persistent challenge in visual recognition, particularly in safety-critical domains where collecting positive examples is expensive and rare events are inherently underrepresented. We propose a lightweight synthetic data augmentation pipeline that fine-tunes a LoRA adapter on as few as 20-50 real images of a rare class and uses a pretrained diffusion model to generate synthetic samples for training. We systematically vary the synthetic-to-real ratio and evaluate the approach across two structurally different domains: chest X-ray pathology classification (NIH ChestX-ray14) and industrial surface crack detection (Magnetic Tile Defect dataset). All evaluations are performed on held-out sets of real images only. Across both domains, synthetic augmentation consistently improves rare-class recall and F1 compared to training with real data alone. Performance improves…
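The abstract describes varying the synthetic-to-real ratio while evaluating only on real images. The sketch below shows one plausible way to assemble a training set for a given ratio. It is an illustration under stated assumptions, not the authors' code: the ratio is assumed to mean "synthetic rare samples per real rare sample", and the names `build_training_set`, `real_rare`, `real_common`, and `synthetic_pool` are hypothetical. Samples are represented as plain identifiers (for example, file names), and the LoRA fine-tuning and image generation steps are assumed to have already produced `synthetic_pool`.

```python
import random


def build_training_set(real_rare, real_common, synthetic_pool, ratio, seed=0):
    """Mix real and synthetic samples at a given synthetic-to-real ratio.

    ratio: assumed here to be the number of synthetic rare-class samples
           per real rare-class sample (0 means real data only).
    Returns a shuffled list of (sample, label, source) tuples, where
    label 1 marks the rare class and source is "real" or "synthetic".
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")

    # Number of synthetic samples to draw, capped by the pool size.
    n_synth = min(int(round(ratio * len(real_rare))), len(synthetic_pool))

    rng = random.Random(seed)
    chosen = rng.sample(list(synthetic_pool), n_synth)

    train = (
        [(x, 1, "real") for x in real_rare]
        + [(x, 0, "real") for x in real_common]
        + [(x, 1, "synthetic") for x in chosen]
    )
    rng.shuffle(train)
    return train


# Sweep a few ratios over a toy dataset with 20 real rare-class images.
real_rare = [f"rare_{i}.png" for i in range(20)]
real_common = [f"common_{i}.png" for i in range(100)]
synthetic_pool = [f"synth_{i}.png" for i in range(200)]

for ratio in (0, 1, 2, 5):
    train = build_training_set(real_rare, real_common, synthetic_pool, ratio)
    n_synth = sum(1 for _, _, src in train if src == "synthetic")
    print(f"ratio={ratio}: {len(train)} training samples, {n_synth} synthetic")
```

Keeping the `source` tag on every sample makes it easy to check that synthetic images never leak into the held-out evaluation set, which the abstract states contains real images only.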
