Enhanced Generative Data Augmentation for Semantic Segmentation via Stronger Guidance
Quang-Huy Che, Duc-Tri Le, Bich-Nga Pham, Duc-Khai Lam, Vinh-Tiep Nguyen

TL;DR
This paper presents a novel data augmentation pipeline using a controllable diffusion model with enhanced prompt and visual reference techniques, significantly improving semantic segmentation performance by generating high-quality, structured synthetic images.
Contribution
The work introduces a new augmentation method leveraging Class-Prompt Appending and Visual Prior Blending to produce more accurate synthetic images for semantic segmentation.
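The summary names Class-Prompt Appending without describing its procedure. As a rough illustration only, the sketch below assumes the technique appends the names of the classes present in an image to a base caption before the prompt is passed to the diffusion model. The function name `append_class_prompt`, the `;` separator, and the example caption are hypothetical and are not taken from the paper.

```python
def append_class_prompt(caption: str, class_names: list[str]) -> str:
    """Append class names to a caption (hypothetical helper, not the paper's code)."""
    # Deduplicate class names while preserving their original order.
    unique_names = []
    for name in class_names:
        if name not in unique_names:
            unique_names.append(name)
    if not unique_names:
        return caption
    # Strip trailing punctuation so the appended list reads cleanly.
    base = caption.rstrip(" .,")
    return f"{base}; {', '.join(unique_names)}"


# Assumed inputs: a caption and the class labels found in the segmentation mask.
prompt = append_class_prompt(
    "a man riding on the back of a horse.",
    ["person", "horse", "person"],
)
print(prompt)
```

The intent of such a step would be to make sure every annotated class is mentioned in the prompt even when the caption omits it; how the paper actually formats or weights the appended classes is not stated in this summary.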
Findings
Improved segmentation accuracy on PASCAL VOC datasets.
Effective generation of diverse, high-quality synthetic images.
Enhanced class balance in training datasets.
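One simple way to check a class-balance claim like the one above is to count how many training images contain each class, before and after adding synthetic images. The sketch below is a minimal illustration under stated assumptions: masks are plain nested lists of integer class ids, and `255` marks ignored pixels (the PASCAL VOC convention). The helper names and the toy masks are hypothetical and are not the paper's balancing algorithm.

```python
from collections import Counter


def class_image_counts(masks, ignore_index=255):
    """Count, for each class id, the number of masks in which it appears."""
    counts = Counter()
    for mask in masks:
        # Collect the distinct class ids in this mask, skipping ignored pixels.
        present = {label for row in mask for label in row if label != ignore_index}
        counts.update(present)
    return counts


def imbalance_ratio(counts):
    """Ratio of the most frequent to the least frequent class (1.0 is balanced)."""
    values = list(counts.values())
    return max(values) / min(values)


# Toy 2x2 masks, invented for illustration only.
masks = [
    [[0, 1], [1, 255]],
    [[0, 0], [0, 0]],
    [[0, 2], [2, 2]],
]
counts = class_image_counts(masks)
ratio = imbalance_ratio(counts)
print(dict(counts), ratio)
```

A lower ratio after augmentation would indicate that the added images favor under-represented classes.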
Abstract
Data augmentation is crucial for pixel-wise annotation tasks like semantic segmentation, where labeling requires significant effort and intensive labor. Traditional methods, involving simple transformations such as rotations and flips, create new images but often lack diversity along key semantic dimensions and fail to alter high-level semantic properties. To address this issue, generative models have emerged as an effective solution for augmenting data by generating synthetic images. Controllable generative models offer data augmentation methods for semantic segmentation tasks by using prompts and visual references from the original image. However, these models face challenges in generating synthetic images that accurately reflect the content and structure of the original image due to difficulties in creating effective prompts and visual references. In this work, we introduce an…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Machine Learning and Data Classification · Natural Language Processing Techniques
Methods: Softmax · Attention Is All You Need · Diffusion
