Extremum Flow Matching for Offline Goal Conditioned Reinforcement Learning

Quentin Rouxel (CUHK); Clemente Donoso; Fei Chen (CUHK); Serena Ivaldi; Jean-Baptiste Mouret

arXiv:2505.19717 · cs.RO · August 21, 2025

TL;DR

This paper introduces a novel goal-conditioned reinforcement learning method using Flow Matching to estimate distribution extrema, enabling humanoid robots to perform complex manipulation tasks from diverse demonstrations.

Contribution

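The paper uses two properties of Flow Matching, deterministic transport and support for arbitrary source distributions, to estimate the minimum or maximum of a learned distribution. It builds several goal-conditioned imitation and reinforcement learning algorithms on this estimate and validates them on both benchmark tasks and real humanoid robot tasks.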

Findings

01 Effective in diverse demonstration scenarios

02 Successful deployment on humanoid robot for complex tasks

03 Improved performance over existing methods

Abstract

Imitation learning is a promising approach for enabling generalist capabilities in humanoid robots, but its scaling is fundamentally constrained by the scarcity of high-quality expert demonstrations. This limitation can be mitigated by leveraging suboptimal, open-ended play data, often easier to collect and offering greater diversity. This work builds upon recent advances in generative modeling, specifically Flow Matching, an alternative to Diffusion models. We introduce a method for estimating the minimum or maximum of the learned distribution by leveraging the unique properties of Flow Matching, namely, deterministic transport and support for arbitrary source distributions. We apply this method to develop several goal-conditioned imitation and reinforcement learning algorithms based on Flow Matching, where policies are conditioned on both current and goal observations. We explore and…
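The "deterministic transport" property named in the abstract can be illustrated with a toy one-dimensional case. The sketch below is not the paper's algorithm: it assumes a standard Gaussian source, a Gaussian target, and the common linear-interpolation form of Flow Matching with independent coupling, for which the velocity field has a closed form. The function names `velocity` and `transport` and all parameters are hypothetical, chosen only for this illustration.

```python
def velocity(x, t, mean, std):
    """Closed-form marginal velocity for linear-interpolation flow matching
    between a N(0, 1) source and a N(mean, std^2) target (toy assumption,
    independent coupling x_t = (1 - t) * x0 + t * x1)."""
    var_t = (1.0 - t) ** 2 + (t * std) ** 2          # variance of x_t
    slope = (t * std ** 2 - (1.0 - t)) / var_t       # cov(x1 - x0, x_t) / var(x_t)
    return mean + slope * (x - t * mean)


def transport(x0, mean, std, steps=1000):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with midpoint steps.
    No noise is injected, so the same x0 always yields the same output."""
    x = x0
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        x_mid = x + 0.5 * dt * velocity(x, t, mean, std)
        x += dt * velocity(x_mid, t + 0.5 * dt, mean, std)
    return x
```

In this toy setting the integrated map is monotone and equals `mean + std * x0` analytically, so `transport(1.0, mean=2.0, std=0.5)` should return a value close to 2.5, and a source point further out in the upper tail lands further out in the target's upper tail. How the paper turns these properties into an extremum estimate is not described in this excerpt.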
