Collage Diffusion

Vishnu Sarukkai; Linden Li; Arden Ma; Christopher Ré; Kayvon Fatahalian

arXiv:2303.00262 · cs.CV · September 1, 2023 · 1 citation

TL;DR

Collage Diffusion introduces a method for precise, layer-based control over diffusion image generation, enabling users to specify object placement and attributes, and iteratively edit images while maintaining object fidelity.

Contribution

It presents a novel layer-based approach that allows detailed control and editing of generated images, improving object placement and attribute preservation compared to prior methods.
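To make the layer-based input concrete, here is a minimal sketch of how a scene might be represented as a sequence of layers and flattened into an initial collage. The `Layer` class, its field names, and the `composite` helper are hypothetical, introduced only for illustration; the compositing uses the standard back-to-front "over" operator and is not taken from the paper's code.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Layer:
    """Hypothetical layer record: an image, its alpha mask, and a text description."""
    rgb: np.ndarray    # (H, W, 3), floats in [0, 1]
    alpha: np.ndarray  # (H, W), floats in [0, 1]; 1 where the object is present
    text: str          # description of the object in this layer


def composite(layers):
    """Flatten layers back to front with the standard 'over' operator."""
    h, w = layers[0].alpha.shape
    canvas = np.zeros((h, w, 3))
    for layer in layers:
        a = layer.alpha[..., None]  # broadcast alpha over the colour channels
        canvas = a * layer.rgb + (1.0 - a) * canvas
    return canvas


# Toy 2x2 scene: an opaque red background and a blue object in the left column.
red = np.zeros((2, 2, 3))
red[..., 0] = 1.0
blue = np.zeros((2, 2, 3))
blue[..., 2] = 1.0

bottom = Layer(rgb=red, alpha=np.ones((2, 2)), text="a red wall")
top = Layer(rgb=blue, alpha=np.array([[1.0, 0.0], [1.0, 0.0]]), text="a blue vase")
canvas = composite([bottom, top])
```

The flattened collage is only a starting point under this sketch: the alpha masks and per-layer text are kept alongside it so that later steps can still tell which pixels and which words belong to which object.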

Findings

01. Better object placement accuracy in generated images

02. Enhanced preservation of key visual attributes

03. Enables iterative object editing in generated images

Abstract

We seek to give users precise control over diffusion-based image generation by modeling complex scenes as sequences of layers, which define the desired spatial arrangement and visual attributes of objects in the scene. Collage Diffusion harmonizes the input layers to make objects fit together -- the key challenge involves minimizing changes in the positions and key visual attributes of the input layers while allowing other attributes to change in the harmonization process. We ensure that objects are generated in the correct locations by modifying text-image cross-attention with the layers' alpha masks. We preserve key visual attributes of input layers by learning specialized text representations per layer and by extending ControlNet to operate on layers. Layer input allows users to control the extent of image harmonization on a per-object basis, and users can even iteratively edit…
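The idea of steering text-image cross-attention with the layers' alpha masks can be illustrated with a small NumPy sketch. It assumes each text token is associated with one mask over image positions (all ones for tokens that describe the whole scene) and that attention outside the mask is suppressed with a large additive penalty. The function name, the argument layout, and the penalty formulation are assumptions made for illustration, not necessarily the formulation used in the paper.

```python
import numpy as np


def masked_cross_attention(queries, keys, values, token_masks, penalty=1e9):
    """Cross-attention in which each text token is restricted to its layer's mask.

    queries:     (P, d)  one query per image position
    keys:        (T, d)  one key per text token
    values:      (T, dv) one value per text token
    token_masks: (T, P)  alpha mask of the layer that owns each token;
                         use all ones for tokens that apply to the whole scene
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)              # (P, T)
    # Assumed formulation: subtract a large penalty where the layer is absent.
    scores = scores - (1.0 - token_masks.T) * penalty
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights


rng = np.random.default_rng(0)
P, T, d = 4, 2, 8                      # 4 image positions, 2 text tokens
queries = rng.normal(size=(P, d))
keys = rng.normal(size=(T, d))
values = rng.normal(size=(T, d))

# Token 0 is global; token 1 belongs to a layer covering positions 2 and 3 only.
token_masks = np.array([[1.0, 1.0, 1.0, 1.0],
                        [0.0, 0.0, 1.0, 1.0]])
out, weights = masked_cross_attention(queries, keys, values, token_masks)
```

In this toy setup, positions 0 and 1 place no attention weight on token 1, so the object described by that token can only influence the region its layer covers. A softer penalty would allow some leakage across the mask boundary, which may be preferable when masks are rough.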

Videos

Collage Diffusion · YouTube

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques · Computer Graphics and Visualization Techniques

Methods: Diffusion