The Intricate Dance of Prompt Complexity, Quality, Diversity, and Consistency in T2I Models

Zhang Xiaofeng; Aaron Courville; Michal Drozdzal; Adriana Romero-Soriano

arXiv:2510.19557 · cs.CV · February 24, 2026

TL;DR

This paper investigates how prompt complexity affects the quality, diversity, and consistency of images generated by text-to-image models, revealing trade-offs and proposing an evaluation framework.

Contribution

The paper introduces a new framework for comparing the utility of real and synthetic data, and analyzes how prompt complexity affects T2I model outputs across multiple datasets.

Findings

01. Higher prompt complexity reduces diversity and consistency.

02. Increasing prompt complexity decreases distribution shift between synthetic and real data.

03. Prompt expansion with a language model improves diversity and aesthetics.

Abstract

Text-to-image (T2I) models offer great potential for creating virtually limitless synthetic data, a valuable resource compared to fixed and finite real datasets. Previous works evaluate the utility of synthetic data from T2I models on three key desiderata: quality, diversity, and consistency. While prompt engineering is the primary means of interacting with T2I models, the systematic impact of prompt complexity on these critical utility axes remains underexplored. In this paper, we first conduct synthetic experiments to motivate the difficulty of generalization with regard to prompt complexity and explain the observed difficulty with theoretical derivations. Then, we introduce a new evaluation framework that can compare the utility of real data and synthetic data, and present a comprehensive analysis of how prompt complexity influences the utility of synthetic data generated by commonly…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications