Neural Actor-Critic Methods for Hamilton-Jacobi-Bellman PDEs: Asymptotic Analysis and Numerical Studies

Samuel N. Cohen; Jackson Hebner; Deqing Jiang; Justin Sirignano

arXiv:2507.06428 · math.OC · May 20, 2026

TL;DR

This paper introduces a neural actor-critic algorithm for high-dimensional Hamilton-Jacobi-Bellman equations, shows that its training dynamics converge to an infinite-dimensional ODE, and validates it numerically on problems of up to 200 dimensions.

Contribution

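The paper proposes a neural actor-critic framework for high-dimensional HJB PDEs in which the critic architecture satisfies the boundary condition exactly by construction, rather than through a penalty in the training loss. It provides a convergence analysis of the training dynamics and numerical validation in up to 200 dimensions.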
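The idea of building the boundary condition into the critic can be illustrated with a small sketch. It assumes a unit-ball domain, a hypothetical boundary function `boundary_data`, and the common ansatz "boundary data plus a factor that vanishes on the boundary times a network". None of these choices are taken from the paper, whose actual architecture may differ.

```python
import math
import random

def make_network(dim, hidden, seed=0):
    """Random one-hidden-layer tanh network R^dim -> R (illustrative only)."""
    rng = random.Random(seed)
    w = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(hidden)]
    b = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
    c = [rng.gauss(0.0, 1.0) for _ in range(hidden)]

    def net(x):
        total = 0.0
        for w_i, b_i, c_i in zip(w, b, c):
            pre = sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + b_i
            total += c_i * math.tanh(pre)
        return total

    return net

def boundary_data(x):
    """Hypothetical boundary condition g(x); any function could be used here."""
    return x[0] + sum(x_j * x_j for x_j in x)

def distance_factor(x):
    """Smooth factor that vanishes on the unit sphere and is positive inside it."""
    return 1.0 - sum(x_j * x_j for x_j in x)

def critic(x, net):
    """Critic ansatz: equals boundary_data on the sphere for any network weights."""
    return boundary_data(x) + distance_factor(x) * net(x)

net = make_network(dim=3, hidden=16)
x_boundary = [0.6, 0.8, 0.0]   # lies on the unit sphere
x_interior = [0.1, -0.2, 0.3]  # lies strictly inside the unit ball
gap_on_boundary = abs(critic(x_boundary, net) - boundary_data(x_boundary))
```

Because `distance_factor` is zero on the sphere, `gap_on_boundary` is zero up to floating-point rounding whatever the weights are, so no boundary term is needed in a training loss for this ansatz.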

Findings

01. The algorithm accurately solves stochastic control problems in up to 200 dimensions.

02. The training dynamics of the actor and critic networks are shown to converge to an infinite-dimensional ODE.

03. Numerical experiments include problems with non-convex Hamiltonians.

Abstract

We mathematically analyze and numerically study an actor-critic machine learning algorithm for solving high-dimensional Hamilton-Jacobi-Bellman (HJB) partial differential equations from stochastic control theory. The architecture of the critic (the estimator for the value function) is structured so that the boundary condition is always perfectly satisfied (rather than being included in the training loss) and utilizes a biased gradient which reduces computational cost. The actor (the estimator for the optimal control) is trained by minimizing the integral of the Hamiltonian over the domain, where the Hamiltonian is estimated using the critic. We show that the training dynamics of the actor and critic neural networks converge in a Sobolev-type space to a certain infinite-dimensional ordinary differential equation (ODE) as the number of hidden units in the actor and critic $\rightarrow…
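The actor objective described in the abstract, the integral of the Hamiltonian over the domain with the Hamiltonian evaluated through the critic, can be sketched on a toy problem. The sketch below assumes a cube domain, a drift equal to the control, a running cost |x|^2 + |u|^2, a fixed critic V(x) = |x|^2, and a finite-difference gradient. All of these are illustrative assumptions rather than the paper's setup, and the diffusion term is omitted because it does not depend on the control in this toy setting.

```python
import random

def grad_fd(f, x, h=1e-5):
    """Central finite-difference gradient of a scalar function f at the point x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad

def hamiltonian(x, u, critic):
    """Toy control-dependent Hamiltonian: u . grad V(x) + |u|^2 + |x|^2."""
    grad_v = grad_fd(critic, x)
    drift_term = sum(u_i * g_i for u_i, g_i in zip(u, grad_v))
    running_cost = sum(u_i * u_i for u_i in u) + sum(x_i * x_i for x_i in x)
    return drift_term + running_cost

def actor_objective(actor, critic, samples):
    """Monte Carlo average of the Hamiltonian over sampled domain points
    (the domain integral up to a constant volume factor)."""
    return sum(hamiltonian(x, actor(x), critic) for x in samples) / len(samples)

rng = random.Random(0)
dim = 4
# Uniform samples from the cube [-1, 1]^dim (assumed domain).
samples = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(200)]

def toy_critic(x):
    """Fixed stand-in for the critic: V(x) = |x|^2, so grad V(x) = 2x."""
    return sum(x_i * x_i for x_i in x)

def zero_actor(x):
    """Control that does nothing."""
    return [0.0] * len(x)

def greedy_actor(x):
    """Pointwise minimiser of the toy Hamiltonian: u = -grad V(x) / 2 = -x."""
    return [-x_i for x_i in x]

loss_zero = actor_objective(zero_actor, toy_critic, samples)
loss_greedy = actor_objective(greedy_actor, toy_critic, samples)
```

In this toy setting the pointwise minimiser drives the objective to zero up to finite-difference error, while the zero control leaves it at the sample mean of |x|^2. A trained actor would be pushed toward the former by gradient steps on `actor_objective`.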

Taxonomy

Topics: Model Reduction and Neural Networks · Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics