Stabilized neural Hamilton--Jacobi--Bellman solvers: Error analysis and applications in model-based reinforcement learning

Minseok Kim; Yeongjong Kim; Namkyeong Cho; Yeoneung Kim

arXiv:2605.07116·cs.LG·May 11, 2026

TL;DR

This paper develops an error analysis framework for neural network-based Hamilton--Jacobi--Bellman solvers in continuous-time reinforcement learning, providing stability bounds and empirical validation across various control benchmarks.

Contribution

It introduces a novel error theory for hybrid neural HJB solvers, including stability estimates and error bounds, bridging grid-based and PINN approaches in model-based RL.

Findings

01 Stability bounds separate residual, mismatch, and model errors.

02 Finite-sample collocation certificate guarantees error control.

03 Experiments demonstrate the method's effectiveness in high-dimensional control tasks.

Abstract

Physics-informed neural solvers offer a promising route to model-based reinforcement learning in continuous time, where optimal feedback synthesis is governed by Hamilton--Jacobi--Bellman (HJB) equations. Practical implementations often occupy a regime that is neither a classical grid method nor a continuous-PDE PINN: the value function is represented by a neural network, finite-difference HJB policy-evaluation operators are evaluated by network queries at shifted points, and residuals are minimized by random continuous collocation. This regime preserves the stabilized finite-difference policy-evaluation structure while avoiding grid-based value unknowns. We develop an error theory for this hybrid regime. Interpreting finite differences as shift operators acting on neural networks, we prove a population $L^{2}$ stability estimate for one policy-evaluation step with learned dynamics. The…
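To make the hybrid regime described in the abstract concrete, here is a minimal one-dimensional sketch of a finite-difference policy-evaluation residual evaluated by querying a value function at shifted points and averaged over random collocation samples. It is an illustration under stated assumptions, not the paper's scheme: the upwind form of the difference operator, the discounted sign convention $\rho V = r + f V'$, the toy dynamics $f(x) = -x$ with reward $r(x) = x^2$, and every function name are hypothetical choices. A closed-form value function stands in for the neural network so the result can be checked by hand.

```python
import numpy as np

def upwind_policy_eval_residual(value, drift, reward, x, h, rho):
    """Residual of rho*V = r + f*V' with V' replaced by upwind shift differences.

    `value` is any callable (a stand-in for a neural network); it is only
    queried at x, x + h and x - h, so no grid of value unknowns is stored.
    """
    f = drift(x)
    v = value(x)
    d_plus = (value(x + h) - v) / h    # forward shift difference
    d_minus = (v - value(x - h)) / h   # backward shift difference
    # Upwinding (assumed here): forward difference where the drift is
    # positive, backward difference where it is negative.
    transport = np.maximum(f, 0.0) * d_plus + np.minimum(f, 0.0) * d_minus
    return rho * v - reward(x) - transport

# Toy problem (assumption): fixed policy with closed-loop drift f(x) = -x,
# running reward r(x) = x^2, discount rate rho. The exact value function
# is V(x) = x^2 / (rho + 2).
rho, h = 1.0, 1e-2
c = 1.0 / (rho + 2.0)
value = lambda x: c * x**2
drift = lambda x: -x
reward = lambda x: x**2

# Random continuous collocation: sample points uniformly in the domain
# and form a Monte Carlo estimate of the squared L2 residual.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=1024)
res = upwind_policy_eval_residual(value, drift, reward, x, h, rho)
mc_loss = float(np.mean(res**2))
```

For this toy problem the residual of the exact value function works out to $-c\,|x|\,h$ with $c = 1/(\rho + 2)$, so it vanishes at first order in the shift size $h$. In a training loop, `value` would be a network and `mc_loss` the quantity minimized over its parameters.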
