DiffPop: Plausibility-Guided Object Placement Diffusion for Image Composition
Jiacheng Liu, Hang Zhou, Shida Wei, Rui Ma

TL;DR
DiffPop introduces a plausibility-guided diffusion framework for realistic object placement in image composition, leveraging human-in-the-loop training and a structural plausibility classifier to generate diverse, plausible composite images.
Contribution
The paper presents what the authors describe as the first plausibility-guided diffusion model for object placement, combining human feedback with a structural plausibility classifier to improve the realism of composite images.
Findings
Outperforms existing methods in generating plausible composite images
Demonstrates versatility in data augmentation and multi-object placement
Achieves superior results on Cityscapes-OP and OPA datasets
Abstract
In this paper, we address the problem of plausible object placement for the challenging task of realistic image composition. We propose DiffPop, the first framework that utilizes a plausibility-guided denoising diffusion probabilistic model to learn the scale and spatial relations among multiple objects and the corresponding scene image. First, we train an unguided diffusion model to directly learn the object placement parameters in a self-supervised manner. Then, we develop a human-in-the-loop pipeline that exploits human labeling of the diffusion-generated composite images to provide weak supervision for training a structural plausibility classifier. The classifier is then used to guide the diffusion sampling process towards generating plausible object placements. Experimental results verify the superiority of our method for producing plausible and diverse composite images on…
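The guided sampling idea in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not specify the network, the placement parameterization, or the guidance rule, so everything below is an assumption made for illustration. The placement is assumed to be a 3-vector (center x, center y, scale), the learned denoiser is replaced by the closed-form noise prediction for Gaussian toy data (`predict_noise`), the plausibility classifier is replaced by a hypothetical Gaussian log-likelihood around `TARGET` that ignores the noise level (`plausibility_grad`), and the guidance rule is the standard classifier-guidance mean shift for DDPM sampling.

```python
import numpy as np

# Toy noise schedule (assumed values, not taken from the paper).
T = 100
betas = np.linspace(1e-4, 0.1, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

# Hypothetical placement vector: (center_x, center_y, scale), all in [0, 1].
DATA_MEAN = np.array([0.5, 0.5, 0.3])  # toy "training" placements ~ N(DATA_MEAN, DATA_STD^2)
DATA_STD = 0.2
TARGET = np.array([0.8, 0.7, 0.2])     # placement the toy classifier calls plausible
CLF_WIDTH = 0.5


def predict_noise(x_t, t):
    """Stand-in for a trained denoiser: exact noise prediction for Gaussian toy data."""
    var_t = alpha_bars[t] * DATA_STD**2 + (1.0 - alpha_bars[t])
    return np.sqrt(1.0 - alpha_bars[t]) * (x_t - np.sqrt(alpha_bars[t]) * DATA_MEAN) / var_t


def plausibility_grad(x_t):
    """Gradient of the toy log p(plausible | x) = -||x - TARGET||^2 / (2 * CLF_WIDTH^2)."""
    return (TARGET - x_t) / CLF_WIDTH**2


def sample(n, guidance_scale, seed=0):
    """Draw n placement vectors with classifier-guided DDPM ancestral sampling."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 3))  # start from pure noise
    for t in range(T - 1, -1, -1):
        eps = predict_noise(x, t)
        # Unguided DDPM posterior mean.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # Classifier guidance: shift the mean along the classifier gradient,
        # scaled by the step variance (here sigma_t^2 = beta_t).
        mean = mean + guidance_scale * betas[t] * plausibility_grad(x)
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


unguided = sample(2000, guidance_scale=0.0)
guided = sample(2000, guidance_scale=1.0)
print("unguided mean placement:", unguided.mean(axis=0))
print("guided mean placement:  ", guided.mean(axis=0))
```

With `guidance_scale=0.0` the sampler reproduces the toy data distribution; a positive scale pulls samples toward the region the stand-in classifier scores as plausible while the injected noise keeps them diverse. In the paper's setting the gradient would instead come from the trained structural plausibility classifier.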
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques
Methods: Diffusion
