Flow-Multi: A Flow-Matching Multi-Reward Framework for Text-to-Image Generation

Jaegun Lee; Janghoon Choi

PMC · DOI: 10.3390/s26041120 · February 9, 2026

Open Access

TL;DR

This paper introduces Flow-Multi, a new framework for text-to-image generation that uses multiple reward functions to improve image quality and alignment with human preferences.

Contribution

The contribution is a multi-reward reinforcement learning framework that combines flow-matching-based GRPO with Pareto dominance filtering to mitigate overfitting to a single metric and reward hacking.

Findings

01

Flow-Multi achieves balanced improvements across multiple reward criteria compared to Flow-GRPO.

02

The use of Pareto dominance and advantage masking improves policy optimization by focusing on high-quality rewards.

03

The framework demonstrates stable alignment in text-to-image generation without overfitting to specific metrics.

Abstract

Recent approaches in text-to-image (T2I) generation have actively adopted reinforcement learning (RL) techniques for human preference alignment. However, existing approaches primarily rely on a single reward function, which can lead to overfitting on specific metrics, resulting in issues such as reward hacking and imbalanced optimization among multiple objectives. To address this, we propose Flow-Multi: a flow-matching multi-reward framework for text-to-image generation. Our method builds upon flow-matching-based group-relative policy optimization (GRPO) learning. Each sample is evaluated by four reward models—based on text-to-image alignment, human preference, aesthetic quality, and GenEval—to create a multi-dimensional reward vector. We then utilize the Pareto dominance relationship to remove dominated samples and update the policy using only the non-dominated set. Additionally, we…
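The Pareto filtering step described in the abstract can be illustrated with a minimal sketch. It assumes each sample in a group has a reward vector where higher is better on every axis, and that a sample is dominated when another sample is at least as good on all rewards and strictly better on at least one. The function name `non_dominated_mask`, the column order, and the reward values are hypothetical and chosen only for illustration; the paper's actual implementation (including how ties are handled and how the resulting set feeds the policy update) may differ.

```python
import numpy as np


def non_dominated_mask(rewards: np.ndarray) -> np.ndarray:
    """Return a boolean mask that is True for non-dominated samples.

    rewards has shape (n_samples, n_rewards); higher is better on every axis.
    This is an illustrative helper, not the authors' code.
    """
    # at_least[i, j]: sample i is at least as good as sample j on every reward
    at_least = (rewards[:, None, :] >= rewards[None, :, :]).all(axis=-1)
    # strictly[i, j]: sample i is strictly better than sample j on some reward
    strictly = (rewards[:, None, :] > rewards[None, :, :]).any(axis=-1)
    # dominates[i, j]: sample i Pareto-dominates sample j
    dominates = at_least & strictly
    # A sample is kept when no other sample dominates it
    return ~dominates.any(axis=0)


# Made-up reward vectors for one group of four samples.
# Assumed column order: alignment, human preference, aesthetic quality, GenEval.
group_rewards = np.array([
    [0.9, 0.2, 0.5, 0.7],
    [0.8, 0.1, 0.4, 0.6],  # worse than the first sample on every axis
    [0.3, 0.9, 0.6, 0.2],  # trades alignment for human preference
    [0.9, 0.2, 0.5, 0.7],  # identical to the first sample, so not dominated
])

mask = non_dominated_mask(group_rewards)
print(mask.tolist())  # [True, False, True, True]
```

The pairwise comparison costs O(n² · k) for n samples and k rewards, which is acceptable for the small groups typical of GRPO-style training. In this sketch, only the samples where the mask is True would contribute to the policy update.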

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games