Composing Parts for Expressive Object Generation

Harsh Rangwani; Aishwarya Agarwal; Kuldeep Kulkarni; R. Venkatesh Babu; and Srikrishna Karanam

arXiv:2406.10197·cs.CV·July 1, 2025

Open Access

TL;DR

PartComposer is a training-free method that enhances fine-grained part-level control in image generation by localizing object parts and applying localized diffusion, enabling more detailed and customizable object compositions.

Contribution

The paper introduces PartComposer, a novel approach that localizes object parts and applies localized diffusion without additional training, improving control over object parts in generated images.
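To make "localized diffusion" concrete, here is a minimal sketch of one generic way to confine an edit to a part region: blend two denoising updates with a binary mask at every step. This is an assumption for illustration, not the paper's algorithm. The names `blend_latents`, `localized_denoise`, `base_step`, and `part_step` are hypothetical, and the toy step functions stand in for a real diffusion model.

```python
import numpy as np

def blend_latents(base_latent, part_latent, part_mask):
    """Keep part_latent where part_mask is 1 and base_latent elsewhere."""
    mask = part_mask.astype(base_latent.dtype)
    return mask * part_latent + (1.0 - mask) * base_latent

def localized_denoise(latent, part_mask, base_step, part_step, num_steps):
    """Run num_steps updates, applying part_step only inside part_mask.

    base_step and part_step are hypothetical callables (latent, t) -> latent
    standing in for denoising conditioned on the base prompt and on the
    part-level attribute prompt, respectively.
    """
    for t in reversed(range(num_steps)):
        base_next = base_step(latent, t)   # update driven by the base prompt
        part_next = part_step(latent, t)   # update driven by the part prompt
        latent = blend_latents(base_next, part_next, part_mask)
    return latent

# Toy usage: a 4x4 latent with the part occupying the top-left 2x2 block.
latent = np.ones((4, 4))
part_mask = np.zeros((4, 4))
part_mask[:2, :2] = 1.0
result = localized_denoise(
    latent,
    part_mask,
    base_step=lambda x, t: 0.5 * x,        # placeholder, not a real denoiser
    part_step=lambda x, t: 0.5 * x + 1.0,  # placeholder, not a real denoiser
    num_steps=1,
)
```

In this toy run, only the masked block receives the part-specific update; everything outside the mask follows the base update unchanged.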

Findings

01 Effective part-level control demonstrated visually

02 Quantitative improvements over baselines

03 Generalizes across different domains

Abstract

Image composition and generation are processes where the artists need control over various parts of the generated images. However, the current state-of-the-art generation models, like Stable Diffusion, cannot handle fine-grained part-level attributes in the text prompts. Specifically, when additional attribute details are added to the base text prompt, these text-to-image models either generate an image vastly different from the image generated from the base prompt or ignore the attribute details. To mitigate these issues, we introduce PartComposer, a training-free method that enables image generation based on fine-grained part-level attributes specified for objects in the base text prompt. This allows more control for artists and enables novel object compositions by combining distinctive object parts. PartComposer first localizes object parts by denoising the object region from a…

Taxonomy

Topics: Human Motion and Animation · Image Processing and 3D Reconstruction

Methods: Balanced Selection · Diffusion