Evolutionary Token-Level Prompt Optimization for Diffusion Models
Domício Pereira Neto, João Correia, Penousal Machado

TL;DR
This paper introduces an evolutionary algorithm-based method to automatically optimize prompts for diffusion models, improving image quality and prompt-image alignment without manual trial-and-error.
Contribution
The paper presents a genetic algorithm for token-level prompt optimization that outperforms baseline methods, including Promptist and random search, on a fitness function combining aesthetic quality and prompt-image alignment.
Findings
Up to 23.93% improvement in fitness over baselines.
Evaluated on 36 prompts from the Parti Prompts (P2) dataset.
Demonstrates adaptability to various tokenized text encoders.
Abstract
Text-to-image diffusion models exhibit strong generative performance but remain highly sensitive to prompt formulation, often requiring extensive manual trial and error to obtain satisfactory results. This motivates the development of automated, model-agnostic prompt optimization methods that can systematically explore the conditioning space beyond conventional text rewriting. This work investigates the use of a Genetic Algorithm (GA) for prompt optimization by directly evolving the token vectors employed by CLIP-based diffusion models. The GA optimizes a fitness function that combines aesthetic quality, measured by the LAION Aesthetic Predictor V2, with prompt-image alignment, assessed via CLIPScore. Experiments on 36 prompts from the Parti Prompts (P2) dataset show that the proposed approach outperforms the baseline methods, including Promptist and random search, achieving up to a…
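The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each individual is a fixed-length list of integer token IDs, uses tournament selection, one-point crossover, per-token mutation, and elitism (all choices made here for illustration), and replaces the real scorers with a toy function. In the paper the two fitness terms come from the LAION Aesthetic Predictor V2 and CLIPScore computed on generated images; the equal weighting in `combined_fitness` is likewise an assumption.

```python
import random
from typing import Callable, List

Tokens = List[int]


def combined_fitness(aesthetic: float, alignment: float, w: float = 0.5) -> float:
    """Weighted sum of two scores. The weight w is an assumption, not from the paper."""
    return w * aesthetic + (1.0 - w) * alignment


def toy_score(tokens: Tokens) -> float:
    """Hypothetical stand-in for the real scorers (no images are generated here)."""
    target = [3, 1, 4, 1, 5, 9, 2, 6]
    # Stand-in for prompt-image alignment: fraction of positions matching a target.
    alignment = sum(t == g for t, g in zip(tokens, target)) / len(target)
    # Stand-in for aesthetic quality: token diversity within the sequence.
    aesthetic = len(set(tokens)) / len(tokens)
    return combined_fitness(aesthetic, alignment)


def evolve(
    score: Callable[[Tokens], float],
    vocab_size: int,
    length: int,  # must be at least 2 so a crossover point exists
    pop_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.1,
    elite: int = 2,
    seed: int = 0,
) -> Tokens:
    rng = random.Random(seed)
    # Random initial population of token-ID sequences.
    pop = [[rng.randrange(vocab_size) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        # Elitism: copy the best individuals unchanged into the next generation.
        next_pop = [ind[:] for ind in pop[:elite]]
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            a = max(rng.sample(pop, 3), key=score)
            b = max(rng.sample(pop, 3), key=score)
            # One-point crossover.
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]
            # Per-token mutation: resample a token ID with probability mutation_rate.
            child = [
                rng.randrange(vocab_size) if rng.random() < mutation_rate else t
                for t in child
            ]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=score)


if __name__ == "__main__":
    best = evolve(toy_score, vocab_size=10, length=8)
    print(best, toy_score(best))
```

Because the elite individuals are copied unchanged, the best fitness in the population never decreases from one generation to the next. With real scorers, each call to `score` would involve generating and evaluating images, so caching fitness values per individual would matter far more than it does in this toy setting.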
