A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting

Wouter Van Gansbeke; Bert De Brabandere

arXiv:2401.10227 · cs.CV · July 17, 2024 · 1 citation

TL;DR

This paper introduces a simple latent diffusion framework for panoptic segmentation and mask inpainting. It omits the specialized object detection modules, complex loss functions, and ad-hoc post-processing steps that such networks usually rely on, and reports strong results on the COCO and ADE20k datasets.

Contribution

It presents a latent diffusion approach, built on Stable Diffusion, that simplifies the panoptic segmentation architecture, enables mask inpainting, and adapts to multiple tasks through learnable task embeddings.

Findings

01 Strong segmentation results on COCO and ADE20k

02 Effective mask inpainting capabilities

03 Flexible multi-task learning with task embeddings

Abstract

Panoptic and instance segmentation networks are often trained with specialized object detection modules, complex loss functions, and ad-hoc post-processing steps to manage the permutation-invariance of the instance masks. This work builds upon Stable Diffusion and proposes a latent diffusion approach for panoptic segmentation, resulting in a simple architecture that omits these complexities. Our training consists of two steps: (1) training a shallow autoencoder to project the segmentation masks to latent space; (2) training a diffusion model to allow image-conditioned sampling in latent space. This generative approach unlocks the exploration of mask completion or inpainting. The experimental validation on COCO and ADE20k yields strong segmentation results. Finally, we demonstrate our model's adaptability to multi-tasking by introducing learnable task embeddings.
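The second training step in the abstract relies on a diffusion process over mask latents. The following is a minimal sketch of the standard DDPM-style forward noising that such a model is typically trained against, not the authors' implementation. It assumes a linear noise schedule and treats the latent as a flat list of floats standing in for the output of the shallow autoencoder; the function names (`linear_beta_schedule`, `cumulative_alphas`, `add_noise`) and the default schedule values are hypothetical choices for illustration. It uses only the Python standard library.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return per-step noise variances (assumed linear schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return the running product of (1 - beta_t), often written alpha_bar_t."""
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def add_noise(z0, t, alpha_bars, rng):
    """Sample z_t = sqrt(alpha_bar_t) * z0 + sqrt(1 - alpha_bar_t) * eps.

    z0 is a flat list of floats standing in for a mask latent.
    Returns the noised latent and the noise eps, which a denoiser
    would be trained to predict from (z_t, t, image features).
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps


# Usage: build one training pair for a toy latent at a random timestep.
rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
toy_latent = [0.5, -1.0, 0.25, 2.0]   # placeholder for an encoded mask
t = rng.randrange(len(alpha_bars))
noisy_latent, target_noise = add_noise(toy_latent, t, alpha_bars, rng)
```

In a full pipeline, the denoising network would receive `noisy_latent`, the timestep, and features of the input image, and would be trained to regress `target_noise`. How the image conditioning is injected is not specified in the abstract and is left out of this sketch.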

Code & Models

Repositories

segments-ai/latent-diffusion-segmentation (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques

Methods: Diffusion