ScoRe-Flow: Complete Distributional Control via Score-Based Reinforcement Learning for Flow Matching
Xiaotian Qiu, Lukai Chen, Jinhao Li, Qi Sun, Cheng Zhuo, Guohao Dai

TL;DR
ScoRe-Flow introduces a score-based reinforcement learning method for flow matching policies, enabling faster convergence and improved success rates in robotic control tasks by modulating the drift with the score function.
Contribution
ScoRe-Flow is a score-based RL fine-tuning approach that decouples control over the mean and the variance, improving training efficiency and performance.
Findings
ScoRe-Flow converges 2.4x faster than state-of-the-art methods on locomotion tasks.
It achieves up to 5.4% higher success rates on manipulation benchmarks.
It uses a closed-form score function derived from the velocity field, avoiding auxiliary networks.
Abstract
Flow Matching (FM) policies have emerged as an efficient backbone for robotic control, offering fast and expressive action generation that underpins recent large-scale embodied AI systems. However, FM policies trained via imitation learning inherit the limitations of demonstration data; surpassing suboptimal behaviors requires reinforcement learning (RL) fine-tuning. Recent methods convert deterministic flows into stochastic differential equations (SDEs) with learnable noise injection, enabling exploration and tractable likelihoods, but such noise-only control can compromise training efficiency when demonstrations already provide strong priors. We observe that modulating the drift via the score function, i.e., the gradient of log-density, steers exploration toward high-probability regions, improving stability. The score admits a closed-form expression from the velocity field, requiring…
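The closed-form link between score and velocity field can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions the abstract does not state: a linear interpolation path x_t = (1 - t) * x0 + t * x1 with x0 drawn from a standard Gaussian, a 1-D Gaussian toy target so that the exact velocity and score are known, and a drift of the form v + (sigma^2 / 2) * score, which is the standard choice that keeps the SDE marginals equal to those of the deterministic flow. All function names are hypothetical.

```python
import numpy as np

def score_from_velocity(x, v, t):
    """Score grad log p_t(x) recovered from the marginal velocity v(x, t).

    Assumes the path x_t = (1 - t) * x0 + t * x1 with x0 ~ N(0, I);
    other conventions give a different formula. Valid for 0 <= t < 1.
    """
    return (t * v - x) / (1.0 - t)

def gaussian_velocity(x, t, mu, s):
    """Exact marginal velocity E[x1 - x0 | x_t = x] for a toy target x1 ~ N(mu, s^2)."""
    var_t = (1.0 - t) ** 2 + (t * s) ** 2
    return mu + (t * s**2 - (1.0 - t)) / var_t * (x - t * mu)

def gaussian_score(x, t, mu, s):
    """Exact score of the marginal p_t = N(t * mu, (1 - t)^2 + t^2 * s^2)."""
    var_t = (1.0 - t) ** 2 + (t * s) ** 2
    return -(x - t * mu) / var_t

def sde_drift(x, v, t, sigma):
    """Score-modulated drift v + (sigma^2 / 2) * score (assumed form, see lead-in)."""
    return v + 0.5 * sigma**2 * score_from_velocity(x, v, t)

# Compare the velocity-derived score with the analytic score on a few points.
x = np.linspace(-2.0, 2.0, 5)
t, mu, s = 0.3, 1.0, 0.5
v = gaussian_velocity(x, t, mu, s)
max_err = np.max(np.abs(score_from_velocity(x, v, t) - gaussian_score(x, t, mu, s)))
```

In this toy setting the two scores agree to floating-point precision, so no auxiliary score network is needed once the velocity field is available. With sigma set to 0 the drift reduces to the original velocity, which recovers the deterministic flow.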
