DiffPop: Plausibility-Guided Object Placement Diffusion for Image Composition

Jiacheng Liu; Hang Zhou; Shida Wei; Rui Ma

arXiv:2406.07852·cs.CV·June 13, 2024

TL;DR

DiffPop introduces a plausibility-guided diffusion framework for realistic object placement in image composition, leveraging human-in-the-loop training and a structural plausibility classifier to generate diverse, plausible composite images.

Contribution

The paper presents the first plausibility-guided diffusion model for object placement, integrating human feedback and a structural classifier for improved realism in image composition.

Findings

1. Outperforms existing methods in generating plausible composite images
2. Demonstrates versatility in data augmentation and multi-object placement
3. Achieves superior results on Cityscapes-OP and OPA datasets

Abstract

In this paper, we address the problem of plausible object placement for the challenging task of realistic image composition. We propose DiffPop, the first framework that utilizes a plausibility-guided denoising diffusion probabilistic model to learn the scale and spatial relations among multiple objects and the corresponding scene image. First, we train an unguided diffusion model to directly learn the object placement parameters in a self-supervised manner. Then, we develop a human-in-the-loop pipeline that exploits human labeling of the diffusion-generated composite images to provide weak supervision for training a structural plausibility classifier. The classifier is then used to guide the diffusion sampling process towards generating plausible object placements. Experimental results verify the superiority of our method for producing plausible and diverse composite images on…
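To make the guidance step in the abstract concrete, the following is a minimal sketch of generic classifier-guided DDPM sampling over a three-dimensional placement vector (assumed here to be x, y, and scale). It is not the authors' implementation: `toy_denoiser`, `toy_log_plausibility_grad`, the noise schedule, and the `target` placement are all hypothetical stand-ins for the trained diffusion network and the structural plausibility classifier. The only technique shown is the standard guidance rule, in which the reverse-step mean is shifted by the classifier's log-probability gradient scaled by the step variance.

```python
import numpy as np


def make_schedule(T=50, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_denoiser(x_t, t):
    """Hypothetical stand-in for a trained noise-prediction network.

    It always predicts zero noise, so it carries no learned placement prior.
    """
    return np.zeros_like(x_t)


def toy_log_plausibility_grad(x_t, target):
    """Gradient of log p(plausible | x_t) for a toy classifier.

    Assumes log p is -0.5 * ||x_t - target||^2 up to a constant, so the
    gradient points from x_t towards the (hypothetical) plausible placement.
    """
    return -(x_t - target)


def guided_sample(n, target, guidance_scale=1.0, T=50, seed=0):
    """Draw n placement vectors with classifier-guided ancestral sampling."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(T)
    x = rng.standard_normal((n, 3))  # start from pure Gaussian noise
    for t in reversed(range(T)):
        eps = toy_denoiser(x, t)
        # Standard DDPM reverse-step mean computed from the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        var = betas[t]  # fixed reverse-step variance sigma_t^2 = beta_t
        # Classifier guidance: shift the mean by scale * variance * gradient.
        mean = mean + guidance_scale * var * toy_log_plausibility_grad(x, target)
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(var) * noise
    return x


if __name__ == "__main__":
    target = np.array([0.5, 0.5, 0.3])  # hypothetical plausible (x, y, scale)
    for s in (0.0, 20.0):
        samples = guided_sample(256, target, guidance_scale=s)
        dist = np.linalg.norm(samples - target, axis=1).mean()
        print(f"guidance_scale={s}: mean distance to target = {dist:.3f}")
```

With `guidance_scale=0.0` the loop reduces to unguided sampling. Raising the scale pulls samples towards placements the toy classifier scores as plausible, which in a real system would trade some diversity for plausibility.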

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques

Methods: Diffusion