DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery

Chaofan Ma; Yuhuan Yang; Chen Ju; Fei Zhang; Jinxiang Liu; Yu Wang; Ya Zhang; Yanfeng Wang

arXiv:2303.09813·cs.CV·March 20, 2023·6 cites

Open Access

TL;DR

This paper introduces DiffusionSeg, a framework that uses pre-trained diffusion models for unsupervised object discovery (saliency segmentation and object localization). It addresses two obstacles: the structural difference between generative and discriminative models, and the lack of explicitly labeled data.

Contribution

It proposes a two-stage synthesis-exploitation approach with a training-free mask generation method and an inversion technique to adapt diffusion models for discriminative tasks.

Findings

1. DiffusionSeg outperforms existing methods in unsupervised object discovery.

2. The synthesis stage effectively generates training data to improve performance.

3. The inversion technique bridges the gap between generative and discriminative models.

Abstract

Learning from large corpora of data, pre-trained models have achieved impressive progress. As a popular form of generative pre-training, diffusion models capture both low-level visual knowledge and high-level semantic relations. In this paper, we propose to exploit such knowledgeable diffusion models for mainstream discriminative tasks, namely unsupervised object discovery: saliency segmentation and object localization. However, two challenges arise. First, the structural difference between generative and discriminative models limits direct use. Second, the lack of explicitly labeled data significantly limits performance in unsupervised settings. To tackle these issues, we introduce DiffusionSeg, a novel synthesis-exploitation framework with a two-stage strategy. To alleviate data insufficiency, we synthesize abundant images, and propose a novel training-free…

Taxonomy

Topics: Visual Attention and Saliency Detection · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques

Methods: Diffusion