Panoptic Diffusion Models: co-generation of images and segmentation maps
Yinghan Long, Kaushik Roy

TL;DR
The Panoptic Diffusion Model (PDM) is presented as the first model to generate images and panoptic segmentation maps simultaneously, building scene understanding and layout control into diffusion-based image synthesis.
Contribution
PDM introduces a novel framework that co-generates images and panoptic segmentation maps, integrating scene layout understanding into diffusion models.
Findings
PDM achieves state-of-the-art results in image generation with scene control.
The model effectively incorporates text prompts to guide segmentation and image synthesis.
PDM can also perform image-to-image generation when ground-truth maps are available.
Abstract
Recently, diffusion models have demonstrated impressive capabilities in text-guided and image-conditioned image generation. However, existing diffusion models cannot simultaneously generate an image and a panoptic segmentation of objects and stuff from the prompt. Incorporating an inherent understanding of shapes and scene layouts can improve the creativity and realism of diffusion models. To address this limitation, we present Panoptic Diffusion Model (PDM), the first model designed to generate both images and panoptic segmentation maps concurrently. PDM bridges the gap between image and text by constructing segmentation layouts that provide detailed, built-in guidance throughout the generation process. This ensures the inclusion of categories mentioned in text prompts and enriches the diversity of segments within the background. We demonstrate the effectiveness of PDM across two…
Taxonomy
Topics: Medical Image Segmentation Techniques
Methods: Diffusion
