DiffusionInst: Diffusion Model for Instance Segmentation
Zhangxuan Gu, Haoxing Chen, Zhuoer Xu, Jun Lan, Changhua Meng, Weiqiang Wang

TL;DR
DiffusionInst introduces a novel diffusion-based framework for instance segmentation that models instances as filters and performs noise-to-filter denoising, achieving competitive results on COCO and LVIS datasets.
Contribution
It presents a diffusion-based model for instance segmentation that removes the need for a region proposal network (RPN) and predicts masks through a noise-to-filter denoising process.
Findings
Achieves performance competitive with existing instance segmentation models on COCO and LVIS.
Works with various backbones, such as ResNet and Swin Transformers.
Provides a strong baseline for diffusion-based discriminative tasks.
Abstract
Diffusion frameworks have achieved comparable performance with previous state-of-the-art image generation models. Researchers are curious about its variants in discriminative tasks because of its powerful noise-to-image denoising pipeline. This paper proposes DiffusionInst, a novel framework that represents instances as instance-aware filters and formulates instance segmentation as a noise-to-filter denoising process. The model is trained to reverse the noisy groundtruth without any inductive bias from RPN. During inference, it takes a randomly generated filter as input and outputs mask in one-step or multi-step denoising. Extensive experimental results on COCO and LVIS show that DiffusionInst achieves competitive performance compared to existing instance segmentation models with various backbones, such as ResNet and Swin Transformers. We hope our work could serve as a strong baseline,…
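The noise-to-filter idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a cosine noise schedule, a deterministic DDIM-style update, and a filter treated as a plain vector that acts as a 1x1 convolution over mask features. The names `alpha_bar`, `add_noise`, `apply_filter`, `denoise`, and the `oracle` predictor are hypothetical, and the oracle simply stands in for a trained network.

```python
import math
import random


def alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal level at step t under a cosine schedule (an assumption)."""
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def add_noise(filter0, t, num_steps, rng):
    """Forward process: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps, eps ~ N(0, 1)."""
    a = alpha_bar(t, num_steps)
    return [math.sqrt(a) * w + math.sqrt(1 - a) * rng.gauss(0, 1) for w in filter0]


def apply_filter(filter_vec, features):
    """Apply the filter as a 1x1 convolution over C x H x W features.

    Returns H x W mask logits (nested lists).
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    return [
        [sum(filter_vec[c] * features[c][y][x] for c in range(channels))
         for x in range(width)]
        for y in range(height)
    ]


def denoise(x, predict_x0, steps, num_steps):
    """Reverse process from t = num_steps down to 0 in `steps` updates.

    steps=1 is the one-step case; steps > 1 is multi-step denoising.
    Assumes 1 <= steps <= num_steps.
    """
    times = [round(num_steps * (1 - i / steps)) for i in range(steps)] + [0]
    for t, t_next in zip(times[:-1], times[1:]):
        x0 = predict_x0(x, t)  # predictor's estimate of the clean filter
        if t_next == 0:
            x = x0
            continue
        a, a_next = alpha_bar(t, num_steps), alpha_bar(t_next, num_steps)
        # Noise implied by the current sample and the predicted clean filter.
        eps = [(xi - math.sqrt(a) * x0i) / math.sqrt(1 - a)
               for xi, x0i in zip(x, x0)]
        # Deterministic DDIM-style jump to the lower noise level t_next.
        x = [math.sqrt(a_next) * x0i + math.sqrt(1 - a_next) * e
             for x0i, e in zip(x0, eps)]
    return x


rng = random.Random(0)
gt_filter = [0.5, -1.0, 2.0]  # toy ground-truth filter

# Training side: corrupt the ground-truth filter; a model would learn to undo this.
noisy = add_noise(gt_filter, t=500, num_steps=1000, rng=rng)

# Inference side: start from pure noise and denoise with a stand-in predictor.
oracle = lambda x, t: gt_filter  # hypothetical: a trained network goes here
start = [rng.gauss(0, 1) for _ in gt_filter]
recovered = denoise(start, oracle, steps=4, num_steps=1000)

features = [[[1.0, 2.0]], [[3.0, 5.0]], [[0.0, 1.0]]]  # C=3, H=1, W=2
mask_logits = apply_filter(recovered, features)
```

With a learned predictor in place of `oracle`, raising `steps` would trade extra compute for further refinement of the filter, which is the one-step versus multi-step choice the abstract mentions.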
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques
Methods: 1x1 Convolution · Average Pooling · Residual Connection · Batch Normalization · Max Pooling · Kaiming Initialization · Convolution · Bottleneck Residual Block · Residual Block
