All Seeds Are Not Equal: Enhancing Compositional Text-to-Image Generation with Reliable Random Seeds

Shuangqi Li; Hieu Le; Jingyi Xu; Mathieu Salzmann

arXiv:2411.18810 · cs.CV · March 21, 2025

Open Access

TL;DR

This paper investigates how initial random seeds affect compositional image generation in diffusion models and proposes a method to select reliable seeds and fine-tune models for improved consistency and accuracy.

Contribution

The paper introduces a seed mining technique to identify reliable initial noise patterns, enhancing compositional image generation without manual annotation.

Findings

01. Significant improvement in compositional accuracy after fine-tuning.

02. Reliable seeds lead to more consistent object placement in generated images.

03. Quantitative gains of up to 60.7% in spatial composition accuracy.

Abstract

Text-to-image diffusion models have demonstrated remarkable capability in generating realistic images from arbitrary text prompts. However, they often produce inconsistent results for compositional prompts such as "two dogs" or "a penguin on the right of a bowl". Understanding these inconsistencies is crucial for reliable image generation. In this paper, we highlight the significant role of initial noise in these inconsistencies, where certain noise patterns are more reliable for compositional prompts than others. Our analyses reveal that different initial random seeds tend to guide the model to place objects in distinct image areas, potentially adhering to specific patterns of camera angles and image composition associated with the seed. To improve the model's compositional ability, we propose a method for mining these reliable cases, resulting in a curated training set of generated…
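The seed-mining idea described in the abstract can be illustrated with a minimal sketch: render each prompt under each candidate seed, score the result automatically, and keep the seeds with the highest accuracy. This is not the authors' implementation. The names `mine_reliable_seeds`, `toy_generate`, and `toy_is_correct` are hypothetical, and the toy "model" below merely returns an object count so the example runs without a diffusion model; in practice `generate` would wrap a text-to-image pipeline seeded per call, and `is_correct` would be some automatic checker.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


def mine_reliable_seeds(
    prompts: List[str],
    seeds: List[int],
    generate: Callable[[str, int], object],
    is_correct: Callable[[object, str], bool],
    top_k: int,
) -> Tuple[List[int], Dict[int, float]]:
    """Rank seeds by the fraction of prompts they render correctly."""
    hits: Dict[int, int] = defaultdict(int)
    for seed in seeds:
        for prompt in prompts:
            image = generate(prompt, seed)   # one sample per (prompt, seed)
            if is_correct(image, prompt):    # automatic check, no manual labels
                hits[seed] += 1
    accuracy = {seed: hits[seed] / len(prompts) for seed in seeds}
    # sorted() is stable, so ties keep the original seed order
    ranked = sorted(seeds, key=lambda s: accuracy[s], reverse=True)
    return ranked[:top_k], accuracy


def toy_generate(prompt: str, seed: int) -> int:
    """Hypothetical stand-in for a model: returns how many objects were 'drawn'."""
    requested = int(prompt.split()[0])
    # Assumption for illustration: seeds divisible by 3 always count correctly,
    # other seeds drop one object whenever more than two are requested.
    if seed % 3 == 0 or requested <= 2:
        return requested
    return requested - 1


def toy_is_correct(image: int, prompt: str) -> bool:
    """Hypothetical checker: compares the drawn count with the requested count."""
    return image == int(prompt.split()[0])


prompts = ["2 dogs", "3 cats", "4 apples"]
seeds = list(range(6))
reliable, accuracy = mine_reliable_seeds(
    prompts, seeds, toy_generate, toy_is_correct, top_k=2
)
print(reliable)
```

Images produced with the selected seeds and passing the check would then form a candidate fine-tuning set, which is the role the abstract assigns to the mined cases.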

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques

Methods: Sparse Evolutionary Training · Diffusion