Learning-based Hamilton-Jacobi-Bellman Methods for Optimal Control

Sixiong You, Ran Dai, and Ping Lu

arXiv:1907.10097 · math.OC · July 25, 2019 · 1 citation

Open Access

TL;DR

This paper introduces two learning-based methods, one supervised and one based on reinforcement learning, to efficiently solve Hamilton-Jacobi-Bellman equations in optimal control problems by estimating initial adjoint variables in real time.

Contribution

It presents novel learning-based approaches for solving HJB equations, enabling real-time initial guess estimation and addressing cases without pre-existing solution databases.

Findings

01

Supervised learning effectively predicts initial adjoint variables with available solution data.

02

Reinforcement learning learns to solve HJB equations without prior solution datasets.

03

Both methods improve real-time solution capabilities for optimal control problems.

Abstract

Many optimal control problems are formulated as two-point boundary value problems (TPBVPs) with conditions of optimality derived from the Hamilton-Jacobi-Bellman (HJB) equations. In most cases, it is challenging to solve HJB equations due to the difficulty of guessing the adjoint variables. This paper proposes two learning-based approaches to find the initial guess of adjoint variables in real time, which can be applied to solve general TPBVPs. For cases with a database of solutions and corresponding adjoint variables of a TPBVP under varying boundary conditions, a supervised learning method is applied to learn the HJB solutions off-line. After obtaining a trained neural network from supervised learning, we are able to find proper initial adjoint variables for given boundary conditions in real time. However, when validated solutions of TPBVPs are not available, the reinforcement learning method is…
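
To make the role of the initial adjoint variables concrete, here is a minimal single-shooting sketch in Python. It is not the authors' method or test problem. The double-integrator minimum-energy problem, the function names (`dynamics`, `rk4_step`, `shoot`, `solve_tpbvp`), the RK4 integrator, and the finite-difference Newton update are all assumptions chosen for illustration. The unknowns are the initial adjoint values `lam_guess`, which is the quantity the learning methods described in the abstract are meant to supply.

```python
# Toy TPBVP (an assumption, not taken from the paper):
#   minimize 0.5 * integral(u^2) over [0, T]
#   subject to x1' = x2, x2' = u, x(0) = x0, x(T) = (0, 0).
# With H = 0.5*u^2 + lam1*x2 + lam2*u, stationarity gives u = -lam2,
# and the adjoint equations are lam1' = 0, lam2' = -lam1.

def dynamics(z):
    """State/adjoint derivatives for z = [x1, x2, lam1, lam2]."""
    x1, x2, lam1, lam2 = z
    u = -lam2  # control from dH/du = 0
    return [x2, u, 0.0, -lam1]


def rk4_step(z, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, scale):
        return [b + scale * s for b, s in zip(base, slope)]

    k1 = dynamics(z)
    k2 = dynamics(shifted(z, k1, h / 2))
    k3 = dynamics(shifted(z, k2, h / 2))
    k4 = dynamics(shifted(z, k3, h))
    return [zi + h / 6 * (a + 2 * b + 2 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]


def shoot(x0, lam0, T=1.0, steps=100):
    """Integrate forward from (x0, lam0); return the terminal state,
    which is also the residual because the target state is (0, 0)."""
    z = [x0[0], x0[1], lam0[0], lam0[1]]
    h = T / steps
    for _ in range(steps):
        z = rk4_step(z, h)
    return [z[0], z[1]]


def solve_tpbvp(x0, lam_guess, tol=1e-8, max_iter=20, eps=1e-6):
    """Newton iteration on the initial adjoints with a
    finite-difference Jacobian. Returns (lam0, iterations_used)."""
    lam = list(lam_guess)
    for it in range(max_iter):
        r = shoot(x0, lam)
        if max(abs(r[0]), abs(r[1])) < tol:
            return lam, it
        # Build the 2x2 Jacobian d(residual)/d(lam0) column by column.
        J = [[0.0, 0.0], [0.0, 0.0]]
        for j in range(2):
            perturbed = list(lam)
            perturbed[j] += eps
            rp = shoot(x0, perturbed)
            for i in range(2):
                J[i][j] = (rp[i] - r[i]) / eps
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        # Newton step: lam <- lam - J^{-1} r (explicit 2x2 inverse).
        d0 = (J[1][1] * r[0] - J[0][1] * r[1]) / det
        d1 = (-J[1][0] * r[0] + J[0][0] * r[1]) / det
        lam = [lam[0] - d0, lam[1] - d1]
    return lam, max_iter


lam0, iterations = solve_tpbvp(x0=[1.0, 0.0], lam_guess=[0.0, 0.0])
print(lam0, iterations)
```

For this toy problem with x0 = (1, 0) and T = 1, the initial adjoints can be worked out by hand as (12, 6), so the numerical result can be checked against that. Because the toy problem is linear, Newton's method recovers the answer from almost any starting guess. In nonlinear problems, shooting of this kind is typically sensitive to the starting guess, which is the gap a learned predictor of initial adjoint variables would aim to fill.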

Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Frequency Control in Power Systems