Epigraph-Guided Flow Matching for Safe and Performant Offline Reinforcement Learning
Manan Tayal, Mumuksh Tayal

TL;DR
EpiFlow is an offline RL framework that jointly optimizes safety and performance by combining an epigraph reformulation with flow matching, yielding safe, high-performing policies on safety-critical tasks.
Contribution
The paper proposes Epigraph-Guided Flow Matching (EpiFlow), a new method that formulates safe offline RL as a state-constrained optimal control problem with a feasibility value function.
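For readers unfamiliar with the flow matching named in the method title, the following is a minimal sketch of the generic linear-path conditional flow matching loss for an action-generating policy. It is not the paper's training objective: the function name `flow_matching_loss`, the `velocity_fn(x_t, t, states)` signature, and the Gaussian noise source are all assumptions made for illustration, and the epigraph-based guidance is omitted entirely.

```python
import numpy as np

def flow_matching_loss(velocity_fn, actions, states, rng):
    """Monte Carlo estimate of a linear-path conditional flow matching loss.

    velocity_fn(x_t, t, states) is a hypothetical policy network that
    predicts a velocity for each noisy action x_t at time t.
    """
    noise = rng.standard_normal(actions.shape)   # x0 drawn from N(0, I)
    t = rng.uniform(size=(actions.shape[0], 1))  # one time in [0, 1) per sample
    x_t = (1.0 - t) * noise + t * actions        # point on the straight path x0 -> action
    target = actions - noise                     # velocity of that straight path
    pred = velocity_fn(x_t, t, states)
    # Squared error between predicted and target velocity, averaged over the batch.
    return float(np.mean(np.sum((pred - target) ** 2, axis=1)))

# Usage with a placeholder model that always predicts zero velocity.
rng = np.random.default_rng(0)
demo_actions = rng.standard_normal((8, 2))  # assumed dataset actions
demo_states = rng.standard_normal((8, 3))   # assumed dataset states
demo_loss = flow_matching_loss(
    lambda x, t, s: np.zeros_like(x), demo_actions, demo_states, rng
)
```

In a sketch like this, sampling an action at test time would mean integrating the learned velocity field from noise at t = 0 to t = 1, conditioned on the current state.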
Findings
Achieves competitive returns on safety-critical benchmarks.
Maintains near-zero safety violations in experiments.
Effectively balances safety and reward optimization.
Abstract
Offline reinforcement learning (RL) provides a compelling paradigm for training autonomous systems without the risks of online exploration, particularly in safety-critical domains. However, jointly achieving strong safety and performance from fixed datasets remains challenging. Existing safe offline RL methods often rely on soft constraints that allow violations, introduce excessive conservatism, or struggle to balance safety, reward optimization, and adherence to the data distribution. To address this, we propose Epigraph-Guided Flow Matching (EpiFlow), a framework that formulates safe offline RL as a state-constrained optimal control problem to co-optimize safety and performance. We learn a feasibility value function derived from an epigraph reformulation of the optimal control problem, thereby avoiding the decoupled objectives or post-hoc filtering common in prior work. Policies are…
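The epigraph reformulation mentioned in the abstract can be illustrated with the generic construction below. This is a sketch under assumed notation, not the paper's own definitions: the cost $J$, the constraint function $g$ (with $g \le 0$ meaning safe), the auxiliary variable $z$, and the symbol $\hat{V}$ are all introduced here for illustration, and the minima are assumed to be attained.

```latex
% State-constrained problem (assumed notation): minimize cost while staying safe.
V(x) = \min_{\pi} \; J(x, \pi)
\quad \text{s.t.} \quad g(x_t) \le 0 \;\; \forall t .

% Epigraph form: introduce a scalar bound z on the cost and minimize z instead.
V(x) = \min_{z \in \mathbb{R}} \; z
\quad \text{s.t.} \quad \hat{V}(x, z) \le 0 ,

% where the auxiliary value function folds safety and performance into one objective.
\hat{V}(x, z) = \min_{\pi} \; \max\Bigl\{ \max_{t} \, g(x_t), \; J(x, \pi) - z \Bigr\} .
```

Under these assumptions, $\hat{V}(x, z) \le 0$ holds exactly when some policy keeps the trajectory safe and achieves cost at most $z$, which is why a single value function can account for both safety and performance rather than treating them as decoupled objectives.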
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Adaptive Dynamic Programming Control
