ObjectComposer: Consistent Generation of Multiple Objects Without Fine-tuning

Alec Helbling; Evan Montoya; Duen Horng Chau

arXiv:2310.06968·cs.CV·October 12, 2023

TL;DR

ObjectComposer is a training-free method for generating images of multiple objects from text prompts while keeping each object's appearance consistent. It relies on preexisting models rather than fine-tuning, which makes it useful for applications such as comic book illustration.

Contribution

The paper introduces a training-free approach for generating consistent multi-object images without modifying the weights of the underlying diffusion model.

Findings

01 Successfully generates multi-object compositions with consistent object appearances.

02 Operates without fine-tuning, reducing computational costs.

03 Builds upon BLIP-Diffusion for object-specific image generation.

Abstract

Recent text-to-image generative models can generate high-fidelity images from text prompts. However, these models struggle to consistently generate the same objects in different contexts with the same appearance. Consistent object generation is important to many downstream tasks like generating comic book illustrations with consistent characters and setting. Numerous approaches attempt to solve this problem by extending the vocabulary of diffusion models through fine-tuning. However, even lightweight fine-tuning approaches can be prohibitively expensive to run at scale and in real-time. We introduce a method called ObjectComposer for generating compositions of multiple objects that resemble user-specified images. Our approach is training-free, leveraging the abilities of preexisting models. We build upon the recent BLIP-Diffusion model, which can generate images of single objects…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games · Computer Graphics and Visualization Techniques

Methods: Diffusion