Training-Free Generation of Diverse and High-Fidelity Images via Prompt Semantic Space Optimization
Debin Meng, Chen Jin, Zheng Gao, Yanran Li, Ioannis Patras, Georgios Tzimiropoulos

TL;DR
This paper introduces TPSO, a training-free, model-agnostic method that increases the diversity of text-to-image diffusion outputs by exploring underrepresented regions of the token embedding space, producing more varied images without degrading quality.
Contribution
The paper proposes TPSO (Token-Prompt embedding Space Optimization), a training-free approach that improves image diversity in diffusion models by optimizing learnable parameters in the token-prompt embedding space without degrading image quality.
Findings
Significantly increases diversity scores from 1.10 to 4.18 points.
Maintains high image fidelity despite increased diversity.
Works across multiple diffusion backbones and datasets.
Abstract
Image diversity remains a fundamental challenge for text-to-image diffusion models. Low-diversity models tend to generate repetitive outputs, increasing sampling redundancy and hindering both creative exploration and downstream applications. A primary cause is that generation often collapses toward a strong mode in the learned distribution. Existing attempts to improve diversity, such as noise resampling, prompt rewriting, or steering-based guidance, often still collapse to dominant modes or introduce distortions that degrade image quality. In light of this, we propose Token-Prompt embedding Space Optimization (TPSO), a training-free and model-agnostic module. TPSO introduces learnable parameters to explore underrepresented regions of the token embedding space, reducing the tendency of the model to repeatedly generate samples from strong modes of the learned distribution. At the same…
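To make the idea of "learnable parameters that explore the token embedding space" concrete, here is a minimal sketch. It is not the paper's algorithm: the abstract is truncated before it explains how fidelity is preserved, so the norm-ball constraint, the pairwise-distance objective, and the names `pairwise_diversity` and `diversify_prompt_embeddings` are all illustrative assumptions. The sketch learns one small offset per sample, pushes the offsets apart, and keeps each inside a fixed radius around the original prompt embedding.

```python
import numpy as np

def pairwise_diversity(embs):
    """Mean squared distance over all ordered pairs in a (K, T, D) stack."""
    flat = embs.reshape(embs.shape[0], -1)
    diffs = flat[:, None, :] - flat[None, :, :]
    k = flat.shape[0]
    return float((diffs ** 2).sum() / (k * (k - 1)))

def diversify_prompt_embeddings(token_emb, k=4, steps=50, lr=0.1, radius=0.5, seed=0):
    """Return K perturbed copies of a frozen (T, D) prompt embedding.

    Hypothetical stand-in for a diversity objective: projected gradient
    ascent on pairwise distance between learnable offsets.
    """
    rng = np.random.default_rng(seed)
    # Assumed learnable parameters: one small offset tensor per sample.
    delta = 0.01 * rng.standard_normal((k,) + token_emb.shape)
    for _ in range(steps):
        # Up to a positive constant, the gradient of the pairwise squared
        # distance w.r.t. each offset points away from the mean offset.
        grad = delta - delta.mean(axis=0, keepdims=True)
        delta = delta + lr * grad
        # Assumed fidelity proxy: project each offset back into an L2 ball
        # so the perturbed embedding stays close to the original prompt.
        norms = np.linalg.norm(delta.reshape(k, -1), axis=1).reshape(k, 1, 1)
        delta = delta * np.minimum(1.0, radius / np.maximum(norms, 1e-12))
    return token_emb[None] + delta
```

In a real pipeline the returned embeddings would be fed to the diffusion model in place of the single original prompt embedding, and the objective would presumably be defined on the model's outputs or features rather than on the offsets themselves.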
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Model Reduction and Neural Networks
