Learning-based Hamilton-Jacobi-Bellman Methods for Optimal Control
Sixiong You, Ran Dai, and Ping Lu

TL;DR
This paper introduces two learning-based methods, one using supervised learning and one using reinforcement learning, that estimate initial adjoint variables in real time so that the Hamilton-Jacobi-Bellman equations arising in optimal control problems can be solved efficiently.
Contribution
It presents learning-based approaches for solving HJB equations that estimate the initial guess of the adjoint variables in real time, including for cases where no database of existing solutions is available.
Findings
Supervised learning effectively predicts initial adjoint variables with available solution data.
Reinforcement learning learns to solve HJB equations without prior solution datasets.
Both methods improve real-time solution capabilities for optimal control problems.
Abstract
Many optimal control problems are formulated as two-point boundary value problems (TPBVPs) with conditions of optimality derived from the Hamilton-Jacobi-Bellman (HJB) equations. In most cases, it is challenging to solve HJB equations due to the difficulty of guessing the adjoint variables. This paper proposes two learning-based approaches to find the initial guess of adjoint variables in real time, which can be applied to solve general TPBVPs. For cases with a database of solutions and corresponding adjoint variables of a TPBVP under varying boundary conditions, a supervised learning method is applied to learn the HJB solutions off-line. After obtaining a trained neural network from supervised learning, we are able to find proper initial adjoint variables for given boundary conditions in real time. However, when validated solutions of TPBVPs are not available, the reinforcement learning method is…
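The supervised idea in the abstract (learn a map from boundary conditions to initial adjoint variables off-line, then evaluate it on-line) can be illustrated with a minimal sketch. Everything in it is an assumption chosen for illustration and not taken from the paper: the toy problem is a minimum-energy double integrator steered to the origin in a fixed time T = 1, the "database" is generated from the closed-form solution of that toy TPBVP, and a linear least-squares fit stands in for the neural network. The names `terminal_state`, `exact_costate`, `solve2`, and `predict` are hypothetical.

```python
import random

T = 1.0  # fixed final time (assumed for this toy problem)


def terminal_state(x0, lam0):
    """x(T) for x1' = x2, x2' = u under the stationarity condition u = -lambda2.

    The costate equations give lambda1 constant and
    lambda2(t) = lambda2(0) - lambda1 * t, so the state integrates in closed form.
    """
    x1, x2 = x0
    l1, l2 = lam0
    x1_T = x1 + x2 * T - l2 * T**2 / 2 + l1 * T**3 / 6
    x2_T = x2 - l2 * T + l1 * T**2 / 2
    return (x1_T, x2_T)


def solve2(M, b):
    """Solve a 2x2 linear system M z = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    z0 = (b[0] * M[1][1] - M[0][1] * b[1]) / det
    z1 = (M[0][0] * b[1] - M[1][0] * b[0]) / det
    return (z0, z1)


def exact_costate(x0):
    """Initial costate that drives x0 to the origin at time T (database label)."""
    M = [[T**3 / 6, -T**2 / 2], [T**2 / 2, -T]]
    b = (-(x0[0] + x0[1] * T), -x0[1])
    return solve2(M, b)


# Off-line: build a database of (boundary condition, initial costate) pairs.
random.seed(0)
states = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
costates = [exact_costate(x0) for x0 in states]

# Off-line: least-squares fit of lam0 ~ W x0 via the normal equations.
# (A linear model replaces the neural network; it suffices only because
# this toy problem is linear-quadratic.)
G = [[sum(x[i] * x[j] for x in states) for j in range(2)] for i in range(2)]
W = []
for k in range(2):
    rhs = tuple(sum(x[i] * lam[k] for x, lam in zip(states, costates)) for i in range(2))
    W.append(solve2(G, rhs))


def predict(x0):
    """On-line: evaluate the fitted model to get an initial costate guess."""
    return tuple(W[k][0] * x0[0] + W[k][1] * x0[1] for k in range(2))


x0_new = (0.3, -0.2)
lam0_guess = predict(x0_new)
residual = terminal_state(x0_new, lam0_guess)  # terminal error, close to (0, 0)
print(lam0_guess, residual)
```

In this linear toy case the fitted map reproduces the initial costate essentially exactly, so the terminal residual is near zero. For a nonlinear TPBVP the prediction would more plausibly serve as a warm start for a shooting or root-finding iteration on the terminal residual, rather than as the final answer.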
Taxonomy
Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Frequency Control in Power Systems
