Graph-GRPO: Training Graph Flow Models with Reinforcement Learning

Baoheng Zhu; Deyu Bo; Delvin Ce Zhang; Xiao Wang

arXiv:2603.10395 · cs.LG · March 12, 2026

TL;DR

Graph-GRPO introduces a reinforcement learning framework for training graph flow models with verifiable rewards, improving graph generation quality and efficiency, especially in molecular optimization tasks.

Contribution

The paper derives an analytical transition probability for graph flow models (GFMs) and proposes a localized exploration strategy, enabling effective RL training and self-improvement.
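
As a rough illustration of the RL side: GRPO-style methods typically score each sample relative to the other samples drawn in the same group, rather than against a learned value function. The sketch below assumes that convention carries over here. The function name, the use of the population standard deviation, and the `eps` constant are illustrative assumptions, not details taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of sampled graphs.

    rewards: list of float rewards, one per graph in the group.
    eps: small constant (assumed) that avoids division by zero
         when every graph in the group receives the same reward.
    Returns one advantage per graph: (reward - group mean) / group std.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of four graphs scored by a verifiable reward
# (for example, 1.0 if the graph passes a validity check, else 0.0).
advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```

Graphs that score above the group mean receive a positive advantage and are reinforced; those below the mean are discouraged. When every graph in a group earns the same reward, all advantages are zero and the group contributes no learning signal.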

Findings

01. Achieves 95.0% validity on synthetic datasets.

02. Attains state-of-the-art results in molecular optimization.

03. Uses only 50 denoising steps for high-quality generation.

Abstract

Graph generation is a fundamental task with broad applications, such as drug discovery. Recently, discrete flow matching-based graph generation, also known as the graph flow model (GFM), has emerged due to its superior performance and flexible sampling. However, effectively aligning GFMs with complex human preferences or task-specific objectives remains a significant challenge. In this paper, we propose Graph-GRPO, an online reinforcement learning (RL) framework for training GFMs under verifiable rewards. Our method makes two key contributions: (1) We derive an analytical expression for the transition probability of GFMs, replacing Monte Carlo sampling and enabling fully differentiable rollouts for RL training; (2) We propose a refinement strategy that randomly perturbs specific nodes and edges in a graph and regenerates them, allowing for localized exploration and self-improvement of…
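
The perturbation half of the refinement strategy in contribution (2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the `MASK` placeholder, the `frac` parameter, the dictionary edge representation, and the choice to mask every edge touching a selected node are all hypothetical. Regeneration of the masked entries would be done by the trained GFM and is not shown.

```python
import random

MASK = -1  # hypothetical placeholder meaning "to be regenerated by the model"


def perturb_graph(node_labels, edge_labels, frac=0.2, seed=None):
    """Mask a random subset of nodes and every edge touching them.

    node_labels: list of int node types.
    edge_labels: dict mapping (i, j) node-index pairs to int edge types.
    frac: assumed fraction of nodes to perturb (at least one node is chosen).
    Returns (nodes, edges, chosen); inputs are left unmodified.
    """
    rng = random.Random(seed)
    n = len(node_labels)
    k = max(1, round(frac * n)) if n else 0
    chosen = set(rng.sample(range(n), k))
    nodes = [MASK if i in chosen else lab for i, lab in enumerate(node_labels)]
    edges = {
        (i, j): (MASK if i in chosen or j in chosen else lab)
        for (i, j), lab in edge_labels.items()
    }
    return nodes, edges, chosen


# Hypothetical 5-node path graph with integer node and edge types.
nodes = [0, 1, 2, 3, 4]
edges = {(0, 1): 1, (1, 2): 1, (2, 3): 2, (3, 4): 1}
new_nodes, new_edges, chosen = perturb_graph(nodes, edges, frac=0.4, seed=0)
```

Because only the selected region is masked, the rest of the graph is preserved, which is what makes the exploration local: a regenerated graph differs from its parent only around the chosen nodes.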

Taxonomy

Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Mobile Crowdsensing and Crowdsourcing