Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models
Mischa Dombrowski, Hadrien Reynaud, Matthew Baugh, Bernhard Kainz

TL;DR
This paper introduces a method to generate foreground-background segmentation models from textual descriptions using pre-trained generative models, eliminating the need for labeled datasets and achieving competitive results.
Contribution
It presents a novel approach leveraging latent diffusion models to produce segmentation masks and fine-tune models without pixel-wise labels, advancing unsupervised segmentation techniques.
Findings
Outperforms previous unsupervised segmentation methods
Achieves results close to fully supervised models
Demonstrates effectiveness on multiple object categories and medical images
Abstract
Curating datasets for object segmentation is a difficult task. With the advent of large-scale pre-trained generative models, conditional image generation has been given a significant boost in result quality and ease of use. In this paper, we present a novel method that enables the generation of general foreground-background segmentation models from simple textual descriptions, without requiring segmentation labels. We leverage and explore pre-trained latent diffusion models to automatically generate weak segmentation masks for concepts and objects. The masks are then used to fine-tune the diffusion model on an inpainting task, which enables fine-grained removal of the object while also providing a synthetic foreground and background dataset. We demonstrate that this method beats previous methods in both discriminative and generative performance and closes the gap…
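To make the role of a weak mask concrete, here is a minimal sketch, not the authors' implementation. It assumes that some per-pixel relevance map with values in [0, 1] is already available for a concept; how the paper derives its masks from the diffusion model is not specified in this summary. The function names `weak_mask` and `mask_out`, the threshold of 0.5, and the nested-list grayscale image format are all illustrative assumptions.

```python
def weak_mask(relevance, threshold=0.5):
    """Binarize a per-pixel relevance map into a weak foreground mask.

    `relevance` is a hypothetical 2D list of floats in [0, 1]; pixels at or
    above `threshold` are marked as foreground (1), all others as background (0).
    """
    return [[1 if value >= threshold else 0 for value in row] for row in relevance]


def mask_out(image, mask, fill=0):
    """Blank the masked (foreground) pixels of a grayscale image.

    The result is the kind of input an inpainting model would be asked to
    fill in: the background is kept and the object region is replaced by `fill`.
    """
    return [
        [fill if m == 1 else pixel for pixel, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Toy 2x2 example with made-up values.
relevance = [[0.1, 0.8], [0.6, 0.2]]
image = [[10, 20], [30, 40]]

mask = weak_mask(relevance)
masked_image = mask_out(image, mask)
```

In this toy setup, the pair of the original image and an inpainted version of `masked_image` would correspond to one foreground sample and one background sample of the synthetic dataset the abstract mentions.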
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
Methods: Inpainting · Diffusion
