Unifying Hamilton-Jacobi Reachability and Reinforcement Learning
Prashant Solanki, Isabelle El-Hajj, Jasper van Beers, Erik-Jan van Kampen, Coen de Visser

TL;DR
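This paper unifies Hamilton-Jacobi reachability and Reinforcement Learning through a new running cost formulation. It proves that fixed points of small-step RL value iteration converge to the viscosity solution of the forward discounted HJB equation, and validates the result experimentally on a classical benchmark.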
Contribution
It introduces a novel framework connecting HJ reachability with RL, ensuring safety semantics are preserved in RL algorithms.
Findings
Fixed points of small-step RL value iteration converge to the viscosity solution of the forward discounted HJB equation
Experiments show learned value functions approximate semi-Lagrangian HJB solutions
Framework maintains reachability-based safety in deep RL implementations
Abstract
We unify Hamilton-Jacobi (HJ) reachability and Reinforcement Learning (RL) through a proposed running cost formulation. We prove that the resultant travel-cost value function is the unique bounded viscosity solution of a time-dependent Hamilton-Jacobi-Bellman (HJB) Partial Differential Equation (PDE) with zero terminal data, whose negative sublevel set equals the strict backward-reachable tube. Using a forward reparameterization and a contraction-inducing Bellman update, we show that fixed points of small-step RL value iteration converge to the viscosity solution of the forward discounted HJB. Experiments on a classical benchmark validate this connection by demonstrating convergence of learned value functions toward semi-Lagrangian HJB solutions and by quantifying approximation error across the state space. These results empirically support the theoretical analysis, showing that the…
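The contraction property mentioned in the abstract can be illustrated with a minimal sketch of a discounted semi-Lagrangian value iteration on a toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the 1D dynamics x' = u with u in {-1, 0, 1}, the running cost min(0, |x| - radius) (non-positive, and negative only inside a hypothetical target interval), the discount rate `lam`, the step `dt`, and the function name `semi_lagrangian_value_iteration`. The update is V(x) <- c(x) dt + exp(-lam dt) min_u V(x + u dt), with linear interpolation between grid points.

```python
import numpy as np


def semi_lagrangian_value_iteration(n=201, dt=0.05, lam=1.0, radius=0.5,
                                    tol=1e-8, max_iter=10_000):
    """Toy discounted value iteration on a 1D grid (illustrative only)."""
    xs = np.linspace(-2.0, 2.0, n)            # state grid
    controls = np.array([-1.0, 0.0, 1.0])     # hypothetical control set
    # Hypothetical running cost: negative inside the target |x| < radius, zero outside.
    cost = np.minimum(0.0, np.abs(xs) - radius)
    gamma = np.exp(-lam * dt)                 # per-step discount factor
    # Successor states for every (control, state) pair, clipped to the grid.
    nxt = np.clip(xs[None, :] + dt * controls[:, None], xs[0], xs[-1])

    V = np.zeros(n)
    residuals = []
    for _ in range(max_iter):
        # Linear interpolation of V at the successor states.
        V_next = np.interp(nxt.ravel(), xs, V).reshape(nxt.shape)
        # Bellman update: pay the running cost, then act greedily (minimise).
        V_new = cost * dt + gamma * V_next.min(axis=0)
        res = np.max(np.abs(V_new - V))       # sup-norm change between iterates
        residuals.append(res)
        V = V_new
        if res < tol:
            break
    return xs, V, residuals


if __name__ == "__main__":
    xs, V, residuals = semi_lagrangian_value_iteration()
    print(f"iterations: {len(residuals)}, final residual: {residuals[-1]:.2e}")
    print(f"V at the target centre: {V[len(xs) // 2]:.4f}")
```

Because linear interpolation and clipping are both non-expansive in the sup norm, each sweep shrinks the gap between successive iterates by at least the factor exp(-lam dt), so the iteration settles on a unique fixed point. In this toy setting that fixed point is non-positive everywhere and strictly negative inside the target, which mirrors, under the stated assumptions, the role of the negative sublevel set described in the abstract.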
