Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models

Bowen Ping; Chengyou Jia; Minnan Luo; Hangwei Qian; Ivor Tsang

arXiv:2602.12529·cs.LG·March 17, 2026

Open Access

TL;DR

Flow-Factory is a modular, unified framework that simplifies reinforcement learning in flow-matching models, enabling easy integration, rapid prototyping, and scalable deployment for diverse architectures and algorithms.

Contribution

It introduces a flexible, registry-based architecture that decouples algorithms, models, and rewards, facilitating seamless integration and rapid development in reinforcement learning for flow-matching models.
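To make the registry idea concrete, here is a minimal sketch of how a name-to-component registry can decouple algorithms from rewards. It is an illustration under stated assumptions, not Flow-Factory's actual API: `Registry`, `ALGORITHMS`, `REWARDS`, `ToyAlgorithm`, and `LengthReward` are all hypothetical names invented for this example.

```python
from typing import Any, Callable, Dict, TypeVar

T = TypeVar("T")


class Registry:
    """Hypothetical registry: maps string names to component classes."""

    def __init__(self, kind: str) -> None:
        self.kind = kind
        self._entries: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[T], T]:
        """Return a decorator that stores the decorated class under `name`."""
        def decorator(obj: T) -> T:
            if name in self._entries:
                raise KeyError(f"{self.kind} '{name}' is already registered")
            self._entries[name] = obj  # type: ignore[assignment]
            return obj
        return decorator

    def build(self, name: str, **kwargs: Any) -> Any:
        """Instantiate the component registered under `name`."""
        if name not in self._entries:
            raise KeyError(f"unknown {self.kind} '{name}'")
        return self._entries[name](**kwargs)


# One registry per component family, so each can vary independently.
ALGORITHMS = Registry("algorithm")
REWARDS = Registry("reward")


@ALGORITHMS.register("toy_algo")
class ToyAlgorithm:
    """Placeholder algorithm (assumption); holds only a learning rate."""

    def __init__(self, lr: float = 1e-4) -> None:
        self.lr = lr


@REWARDS.register("length")
class LengthReward:
    """Placeholder reward (assumption); scores a sample by its length."""

    def __call__(self, sample: str) -> float:
        return float(len(sample))


# Components are selected by name, e.g. from a config file.
config = {"algorithm": "toy_algo", "reward": "length"}
algo = ALGORITHMS.build(config["algorithm"], lr=3e-5)
reward_fn = REWARDS.build(config["reward"])
```

With this pattern, adding a new algorithm or reward means registering one more class; the code that reads the config and calls `build` does not change.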

Findings

01 Supports multiple algorithms (GRPO, DiffusionNFT, AWM) across architectures such as Flux, Qwen-Image, and WAN video models

02 Provides production-ready memory optimization and distributed training

03 Enables flexible multi-reward training and rapid prototyping
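One common way to realize multi-reward training is to collapse several reward signals into a single scalar with a weighted sum. The sketch below assumes that scheme; the page does not say how Flow-Factory combines rewards, and `combine_rewards` and the two toy reward functions are hypothetical names used only for illustration.

```python
from typing import Callable, Dict

RewardFn = Callable[[str], float]


def combine_rewards(
    sample: str,
    reward_fns: Dict[str, RewardFn],
    weights: Dict[str, float],
) -> float:
    """Weighted sum of named rewards (an assumed aggregation rule)."""
    missing = set(reward_fns) - set(weights)
    if missing:
        raise KeyError(f"no weight given for rewards: {sorted(missing)}")
    return sum(weights[name] * fn(sample) for name, fn in reward_fns.items())


# Toy reward functions standing in for real preference or quality models.
def length_reward(sample: str) -> float:
    return float(len(sample))


def vowel_reward(sample: str) -> float:
    return float(sum(ch in "aeiou" for ch in sample))


reward_fns = {"length": length_reward, "vowels": vowel_reward}
weights = {"length": 0.5, "vowels": 2.0}
total = combine_rewards("abcd", reward_fns, weights)
```

Keeping the weights in a separate mapping lets the reward mix be changed from configuration without editing any reward function.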

Abstract

Reinforcement learning has emerged as a promising paradigm for aligning diffusion and flow-matching models with human preferences, yet practitioners face fragmented codebases, model-specific implementations, and engineering complexity. We introduce Flow-Factory, a unified framework that decouples algorithms, models, and rewards through a modular, registry-based architecture. This design enables seamless integration of new algorithms and architectures, as demonstrated by our support for GRPO, DiffusionNFT, and AWM across Flux, Qwen-Image, and WAN video models. By minimizing implementation overhead, Flow-Factory empowers researchers to rapidly prototype and scale future innovations with ease. Flow-Factory provides production-ready memory optimization, flexible multi-reward training, and seamless distributed training support. The codebase is available at…
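For readers unfamiliar with GRPO, its central step is to score each sample relative to the other samples generated for the same prompt, rather than against a learned value function. The sketch below shows that group-relative normalization in its commonly described form. It is not taken from the Flow-Factory codebase; the function name, the use of the population standard deviation, and the `eps` value are assumptions.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one group of samples for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    `eps` guards against division by zero when all rewards are equal.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Rewards for three samples generated from the same prompt.
advantages = group_relative_advantages([1.0, 2.0, 3.0])
```

Samples that score above the group mean receive positive advantages and those below receive negative ones, so the update favors the better samples within each group.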

Taxonomy

Topics: Reinforcement Learning in Robotics · Stochastic Gradient Optimization Techniques · Human Pose and Action Recognition