Fine-tuning Flow Matching Generative Models with Intermediate Feedback
Jiajun Fan, Chaoran Cheng, Shuaike Shen, Xiangxin Zhou, Ge Liu

TL;DR
This paper introduces AC-Flow, an actor-critic framework for stable and effective fine-tuning of flow-based text-to-image models using intermediate feedback, overcoming previous training instabilities.
Contribution
The paper proposes AC-Flow, an actor-critic method that combines reward shaping, a dual-stability mechanism (advantage clipping plus a critic warm-up phase), and critic weighting to improve the stability and performance of fine-tuning flow models.
Findings
Achieves state-of-the-art text-to-image alignment on Stable Diffusion 3
Demonstrates robustness to unseen human preference models
Maintains generative quality and diversity during fine-tuning
Abstract
Flow-based generative models have shown remarkable success in text-to-image generation, yet fine-tuning them with intermediate feedback remains challenging, especially for continuous-time flow matching models. Most existing approaches solely learn from outcome rewards, struggling with the credit assignment problem. Alternative methods that attempt to learn a critic via direct regression on cumulative rewards often face training instabilities and model collapse in online settings. We present AC-Flow, a robust actor-critic framework that addresses these challenges through three key innovations: (1) reward shaping that provides well-normalized learning signals to enable stable intermediate value learning and gradient control, (2) a novel dual-stability mechanism that combines advantage clipping to prevent destructive policy updates with a warm-up phase that allows the critic to mature…
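As a rough illustration of the stabilizers named in the abstract, the sketch below shows one way reward normalization, advantage clipping, and a critic warm-up gate could be written. The abstract here does not give the paper's formulas, so everything in this block is an assumption: the function names (`normalize_rewards`, `clipped_advantages`, `actor_update_enabled`), the batch standardization used as "normalization", the symmetric clip range, and the step-count warm-up rule are hypothetical and are not taken from AC-Flow itself.

```python
import math


def normalize_rewards(rewards, eps=1e-8):
    """Standardize a batch of rewards to zero mean and unit variance.

    Assumption: batch standardization stands in for the paper's
    "well-normalized learning signals"; the actual shaping may differ.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def clipped_advantages(shaped_rewards, critic_values, clip=1.0):
    """Compute advantage = shaped reward - critic value, clipped to [-clip, clip].

    Clipping bounds the size of any single policy update signal.
    The clip value of 1.0 is a hypothetical default.
    """
    return [
        max(-clip, min(clip, r - v))
        for r, v in zip(shaped_rewards, critic_values)
    ]


def actor_update_enabled(step, warmup_steps):
    """Return True once the warm-up phase is over.

    Assumption: during the first `warmup_steps` steps only the critic
    is trained, and the actor is updated from step `warmup_steps` on.
    """
    return step >= warmup_steps
```

In a training loop under these assumptions, the critic would be fit on every step, while the actor loss would be weighted by `clipped_advantages(...)` only when `actor_update_enabled(step, warmup_steps)` returns `True`.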
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Artificial Intelligence in Games
