Amodal Instance Segmentation with Diffusion Shape Prior Estimation

Minh Tran; Khoa Vo; Tri Nguyen; Ngan Le

arXiv:2409.18256 · cs.CV · October 8, 2024

TL;DR

This paper introduces AISDiff, a novel amodal instance segmentation method that leverages diffusion models for shape prior estimation, improving segmentation accuracy especially in occluded scenarios.

Contribution

We propose AISDiff with a Diffusion Shape Prior Estimation module that uses pretrained diffusion models to enhance amodal segmentation accuracy.

Findings

01. Outperforms existing methods on multiple AIS benchmarks.

02. Effectively models occlusions and predicts complete object shapes.

03. Utilizes pretrained diffusion models for rich shape prior extraction.

Abstract

Amodal Instance Segmentation (AIS) is the challenging task of predicting segmentation masks for both the visible and occluded parts of objects in an image. Previous methods have often relied on shape priors learned from training data to improve amodal segmentation. However, these approaches are prone to overfitting and ignore object category information. Recent work highlights the potential of conditioned diffusion models, pretrained on extensive datasets, to generate images from latent space. Drawing inspiration from this, we propose AISDiff with a Diffusion Shape Prior Estimation (DiffSP) module. AISDiff first predicts the visible segmentation mask and object category, and handles occlusion by also predicting occluding masks. These predictions are then fed into our DiffSP module to infer the shape prior…
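To make the quantities named in the abstract concrete, here is a minimal toy sketch of how a visible mask, an occluding mask, and an amodal mask relate to one another. It is not the AISDiff or DiffSP implementation; the masks, the `combine` and `area` helpers, and all values are hypothetical and chosen only for illustration.

```python
from typing import Callable, List

Mask = List[List[int]]  # binary mask: 1 = pixel belongs to the region


def combine(a: Mask, b: Mask, op: Callable[[int, int], object]) -> Mask:
    """Apply a per-pixel binary operation to two masks of equal size."""
    return [
        [int(bool(op(x, y))) for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def area(m: Mask) -> int:
    """Count the pixels set to 1."""
    return sum(sum(row) for row in m)


# Hypothetical amodal mask: the full extent of a 2x4 object,
# including the part hidden behind another object.
amodal = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Hypothetical occluding mask: another object covering the right half of the image.
occluder = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Visible part: amodal pixels that are not covered by the occluder.
visible = combine(amodal, occluder, lambda a, o: a and not o)

# Occluded part: amodal pixels hidden behind the occluder.
occluded = combine(amodal, occluder, lambda a, o: a and o)

# Fraction of the object that is hidden.
occlusion_rate = area(occluded) / area(amodal)
```

In this toy setup the amodal mask is known and the visible part is derived from it. An AIS method faces the reverse problem: it observes only the image (and, in the pipeline the abstract describes, predicts the visible mask, category, and occluding masks) and must recover the amodal mask.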

Taxonomy

Topics: Human Pose and Action Recognition · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques

Methods: Diffusion