Policy Optimized Text-to-Image Pipeline Design

Uri Gadot; Rinon Gal; Yftah Ziser; Gal Chechik; Shie Mannor

arXiv:2505.21478·cs.CV·November 4, 2025

TL;DR

This paper presents a reinforcement learning framework for designing text-to-image pipelines that improves image quality and diversity while reducing computational costs, surpassing existing automated methods.

Contribution

It introduces an ensemble of reward models and a two-phase training strategy that uses GRPO (Group Relative Policy Optimization) to design text-to-image pipelines efficiently.
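
As a rough illustration of how these two pieces could fit together, the sketch below scores candidate workflows with an ensemble and converts the scores into GRPO-style group-relative advantages. It is a minimal sketch, not the authors' implementation: the `RewardModel` type, the function names, the toy scoring lambdas, and the choice to average ensemble members are all assumptions made for illustration. Only the within-group standardization reflects the general GRPO recipe.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence

# Hypothetical type: a reward model maps (prompt, workflow) to a scalar score.
RewardModel = Callable[[str, str], float]


def ensemble_score(models: Sequence[RewardModel], prompt: str, workflow: str) -> float:
    """Combine ensemble members by simple averaging (an assumption, not from the paper)."""
    return mean(m(prompt, workflow) for m in models)


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """GRPO-style advantages: standardize each reward within its sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Toy stand-ins for learned reward models (purely illustrative).
    models = [
        lambda p, w: float(len(w.split("+"))),
        lambda p, w: float(w.count("upscale")),
    ]
    prompt = "a red fox in snow"
    # A "group" of candidate workflows sampled for the same prompt.
    group = ["base", "base+upscale", "base+adapter+upscale"]
    rewards = [ensemble_score(models, prompt, w) for w in group]
    print(group_relative_advantages(rewards))
```

Because the rewards come from predicted scores rather than rendered images, a loop like this never has to run the image pipelines themselves, which is the source of the cost savings the summary describes.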

Findings

1. Achieves higher image quality than baseline methods.
2. Creates more diverse pipeline workflows.
3. Reduces computational costs compared to traditional approaches.

Abstract

Text-to-image generation has evolved beyond single monolithic models to complex multi-component pipelines. These combine fine-tuned generators, adapters, upscaling blocks and even editing steps, leading to significant improvements in image quality. However, their effective design requires substantial expertise. Recent approaches have shown promise in automating this process through large language models (LLMs), but they suffer from two critical limitations: extensive computational requirements from generating images with hundreds of predefined pipelines, and poor generalization beyond memorized training examples. We introduce a novel reinforcement learning-based framework that addresses these inefficiencies. Our approach first trains an ensemble of reward models capable of predicting image quality scores directly from prompt-workflow combinations, eliminating the need for costly image…

Taxonomy

Topics: ICT Impact and Policies · Multimedia Communication and Technology