Chimera: Compositional Image Generation using Part-based Concepting
Shivam Singh, Yiming Chen, Agneet Chatterjee, Amit Raj, James Hays, Yezhou Yang, Chitta Baral

TL;DR
Chimera is a novel image generation model that combines parts from multiple images based on text instructions, enabling precise control over object composition without user annotations.
Contribution
We introduce Chimera, a new model for compositional image generation that uses part-based concepting and a custom dataset to improve control and fidelity.
Findings
Outperforms baselines by 14% in part alignment and compositional accuracy.
Achieves 21% improvement in visual quality.
Introduces PartEval, a metric for assessing compositional fidelity.
Abstract
Personalized image generative models are highly proficient at synthesizing images from text or a single image, yet they lack explicit control for composing objects from specific parts of multiple source images without user-specified masks or annotations. To address this, we introduce Chimera, a personalized image generation model that generates novel objects by combining specified parts from different source images according to textual instructions. To train our model, we first construct a dataset from a taxonomy built on 464 unique (part, subject) pairs, which we term semantic atoms. From this, we generate 37k prompts and synthesize the corresponding images with a high-fidelity text-to-image model. We train a custom diffusion prior model with part-conditional guidance, which steers the image-conditioning features to enforce both semantic identity and spatial layout. We also introduce…
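As a rough illustration of how prompts could be assembled from semantic atoms, the sketch below combines (part, subject) pairs into text instructions. The atom list, the `compose_prompts` helper, the "a creature with ..." template, and the rule that parts and subjects must be distinct are all assumptions made for this example; the abstract does not specify the actual taxonomy entries or prompt format.

```python
from itertools import combinations

# Hypothetical semantic atoms, i.e. (part, subject) pairs.
# The dataset described above uses 464 such pairs; these four are invented.
ATOMS = [
    ("head", "lion"),
    ("wings", "hawk"),
    ("tail", "fish"),
    ("body", "horse"),
]


def compose_prompts(atoms, k=2):
    """Build one prompt per combination of k atoms.

    Assumption for this sketch: a combination is kept only if all of its
    parts are distinct and all of its subjects are distinct.
    """
    prompts = []
    for combo in combinations(atoms, k):
        parts = {part for part, _ in combo}
        subjects = {subject for _, subject in combo}
        if len(parts) < k or len(subjects) < k:
            continue  # skip combos that reuse a part or a subject
        phrases = [f"the {part} of a {subject}" for part, subject in combo]
        # Hypothetical prompt template, not taken from the paper.
        prompts.append("a creature with " + " and ".join(phrases))
    return prompts


if __name__ == "__main__":
    for prompt in compose_prompts(ATOMS, k=2):
        print(prompt)
```

Each resulting string would then be passed to a text-to-image model to synthesize a training image; that step is omitted here because the abstract does not name the model.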
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis
