Composing Parts for Expressive Object Generation
Harsh Rangwani, Aishwarya Agarwal, Kuldeep Kulkarni, R. Venkatesh Babu, and Srikrishna Karanam

TL;DR
PartComposer is a training-free method that enhances fine-grained part-level control in image generation by localizing object parts and applying localized diffusion, enabling more detailed and customizable object compositions.
Contribution
The paper introduces PartComposer, a training-free approach that localizes object parts and applies localized diffusion to them, improving control over individual object parts in generated images.
Findings
Effective part-level control demonstrated visually
Quantitative improvements over baselines
Generalizes across different domains
Abstract
Image composition and generation are processes in which artists need control over various parts of the generated image. However, current state-of-the-art generation models, such as Stable Diffusion, cannot handle fine-grained part-level attributes in text prompts. Specifically, when additional attribute details are added to the base text prompt, these text-to-image models either generate an image vastly different from the one generated from the base prompt or ignore the attribute details. To mitigate these issues, we introduce PartComposer, a training-free method that enables image generation based on fine-grained part-level attributes specified for objects in the base text prompt. This gives artists more control and enables novel object compositions by combining distinctive object parts. PartComposer first localizes object parts by denoising the object region from a…
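The idea of applying diffusion only within a localized part region can be illustrated with a generic mask-based blend. This is a minimal sketch and not the authors' implementation: the function name `blend_localized`, the toy arrays, and the blending rule itself are assumptions chosen for illustration, since the abstract above is truncated before it describes the actual procedure.

```python
import numpy as np

def blend_localized(base_latent, part_latent, part_mask):
    """Hypothetical helper: keep the part-conditioned latent inside the
    mask and the base-prompt latent everywhere else."""
    mask = part_mask.astype(base_latent.dtype)
    return mask * part_latent + (1.0 - mask) * base_latent

# Toy 4x4 arrays standing in for latents (assumed shapes, not from the paper).
base = np.zeros((4, 4))              # latent from the base prompt
part = np.ones((4, 4))               # latent from a part-attribute prompt
mask = np.zeros((4, 4), dtype=bool)  # assumed localized part region
mask[1:3, 1:3] = True

blended = blend_localized(base, part, mask)
```

In a sketch like this, the region outside the mask stays identical to the base latent, which is one way to keep the overall image close to the base-prompt result while changing a single part.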
Taxonomy
Topics: Human Motion and Animation · Image Processing and 3D Reconstruction
Methods: Balanced Selection · Diffusion
