Dataset Diffusion: Diffusion-based Synthetic Dataset Generation for Pixel-Level Semantic Segmentation
Quang Nguyen, Truong Vu, Anh Tran, Khoi Nguyen

TL;DR
This paper uses Stable Diffusion to generate synthetic images together with pixel-level semantic segmentation labels, so that segmentation models can be trained with less manual annotation effort.
Contribution
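The paper presents three techniques that draw on Stable Diffusion's text prompts, cross-attention, and self-attention: class-prompt appending, class-prompt cross-attention, and self-attention exponentiation. Together they produce segmentation maps for images generated from text prompts, enabling synthetic dataset creation for semantic segmentation.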
Findings
Reports stronger segmentation results than existing methods on the PASCAL VOC and MS COCO datasets.
Effectively incorporates uncertainty regions to handle pseudo-label imperfections.
Provides publicly available code and benchmarks for reproducibility.
Abstract
Preparing training data for deep vision models is a labor-intensive task. To address this, generative models have emerged as an effective solution for generating synthetic data. While current generative models produce image-level category labels, we propose a novel method for generating pixel-level semantic segmentation labels using the text-to-image generative model Stable Diffusion (SD). By utilizing the text prompts, cross-attention, and self-attention of SD, we introduce three new techniques: class-prompt appending, class-prompt cross-attention, and self-attention exponentiation. These techniques enable us to generate segmentation maps corresponding to synthetic images. These maps serve as pseudo-labels for training semantic segmenters, eliminating the need for labor-intensive pixel-wise annotation. To account for the imperfections in our pseudo-labels, we incorporate uncertainty…
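The abstract names the techniques but not their formulas, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that self-attention exponentiation means raising a row-normalized self-attention matrix to an integer power and using it to propagate per-class cross-attention scores between pixels, and that uncertain pixels are given an ignore label. The function name, the thresholds `low` and `high`, the default `power`, and the ignore value 255 are all hypothetical choices made for illustration.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed label for uncertain pixels (hypothetical choice)


def attention_to_pseudo_labels(cross_attn, self_attn, power=4, low=0.3, high=0.6):
    """Turn attention maps into a pseudo-label vector (illustrative sketch).

    cross_attn: (num_pixels, num_classes) per-class scores in [0, 1].
    self_attn:  (num_pixels, num_pixels) row-normalized pixel affinities.
    power, low, high: hypothetical hyperparameters, not taken from the paper.
    """
    # Assumed form of "self-attention exponentiation": repeated application
    # of the self-attention matrix spreads class scores to similar pixels.
    refined = np.linalg.matrix_power(self_attn, power) @ cross_attn

    best_class = refined.argmax(axis=1)
    score = refined.max(axis=1)

    # Confident pixels get a class id (1-based); 0 is background.
    labels = np.where(score >= high, best_class + 1, 0)
    # Pixels in the ambiguous band are marked as uncertain.
    labels = np.where((score >= low) & (score < high), IGNORE_INDEX, labels)
    return labels


if __name__ == "__main__":
    # Toy 2x2 image (4 pixels) with 2 classes.
    cross = np.array([[1.0, 0.0], [0.5, 0.0], [0.0, 1.0], [0.1, 0.0]])
    identity = np.eye(4)  # no propagation between pixels
    print(attention_to_pseudo_labels(cross, identity))
```

In a training loop, pixels carrying the ignore label would typically be excluded from the segmentation loss, which is one common way to keep imperfect pseudo-labels from penalizing the segmenter.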
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
Methods: Diffusion
