Learning to Sample Effective and Diverse Prompts for Text-to-Image Generation
Taeyoung Yun, Dinghuai Zhang, Jinkyoo Park, and Ling Pan

TL;DR
This paper introduces PAG, a prompt adaptation method for text-to-image generation that uses GFlowNets to produce diverse, high-quality prompts, avoiding the repetitive postfixes and near-deterministic behavior observed with reinforcement learning approaches.
Contribution
We propose PAG, a GFlowNet-based framework for prompt adaptation that improves diversity and effectiveness in text-to-image generation, addressing mode collapse and neural plasticity issues.
Findings
PAG generates more diverse prompts than RL-based methods.
PAG demonstrates robustness across different reward functions.
PAG transfers effectively to various text-to-image models.
Abstract
Recent advances in text-to-image diffusion models have achieved impressive image generation capabilities. However, it remains challenging to control the generation process with desired properties (e.g., aesthetic quality, user intention), which can be expressed as black-box reward functions. In this paper, we focus on prompt adaptation, which refines the original prompt into model-preferred prompts to generate desired images. While prior work uses reinforcement learning (RL) to optimize prompts, we observe that applying RL often results in generating similar postfixes and deterministic behaviors. To this end, we introduce Prompt Adaptation with GFlowNets (PAG), a novel approach that frames prompt adaptation as a probabilistic inference problem. Our key insight is that leveraging Generative Flow Networks (GFlowNets) allows us to shift from reward…
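As a rough illustration of why sampling can yield more varied prompts than pure reward maximization, the toy sketch below compares always picking the best-scoring postfix with drawing postfixes in proportion to their reward, which is the sampling target GFlowNets are generally trained toward. It is not a GFlowNet and not the authors' method: the postfixes, the reward values, and the function names are all hypothetical, chosen only for illustration.

```python
import math
import random
from collections import Counter

# Hypothetical candidate postfixes with made-up reward scores (not from the paper).
REWARDS = {
    "highly detailed, 4k": 3.0,
    "oil painting, soft light": 2.5,
    "studio photo, bokeh": 2.0,
    "pencil sketch": 0.5,
}

def greedy_choice(rewards):
    """Reward maximization: always return the single best-scoring candidate."""
    return max(rewards, key=rewards.get)

def reward_proportional_sample(rewards, rng):
    """Draw one candidate with probability R(x) divided by the sum of all rewards."""
    candidates = list(rewards)
    weights = [rewards[c] for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

def entropy(counts):
    """Shannon entropy (in nats) of an empirical distribution given as counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)

rng = random.Random(0)  # fixed seed so repeated runs draw the same samples
greedy_counts = Counter(greedy_choice(REWARDS) for _ in range(1000))
sampled_counts = Counter(reward_proportional_sample(REWARDS, rng) for _ in range(1000))

print("distinct postfixes (greedy): ", len(greedy_counts))
print("distinct postfixes (sampled):", len(sampled_counts))
print("entropy (greedy): ", entropy(greedy_counts))
print("entropy (sampled):", entropy(sampled_counts))
```

The greedy strategy collapses onto one postfix, while reward-proportional sampling still favors high-reward postfixes but keeps the others in play. A real system would have to learn such a sampler over a far larger space of token sequences rather than enumerate candidates as done here.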
Taxonomy
Topics: Video Analysis and Summarization · Multimodal Machine Learning Applications · Educational Games and Gamification
Methods: Diffusion · Perturbed-Attention Guidance · Focus
