Hamilton-Jacobi-Bellman Equations for Maximum Entropy Optimal Control

Jeongho Kim; Insoon Yang

arXiv:2009.13097 · math.OC · September 29, 2020 · 5 cites

Open Access

TL;DR

This paper extends maximum entropy reinforcement learning to continuous-time deterministic control problems by deriving novel Hamilton-Jacobi-Bellman equations, providing computationally tractable solutions and characterizing optimal controls explicitly.

Contribution

It introduces a new class of HJB equations for maximum entropy control in continuous time, proves that the optimal value function is their unique viscosity solution, and obtains explicit solutions in special cases.

Findings

01. The optimal value function is the unique viscosity solution of the HJB equation.

02. The maximum entropy formulation improves the regularity of the viscosity solution.

03. Explicit solutions for linear-quadratic problems and Gaussian controls.

Abstract

Maximum entropy reinforcement learning (RL) methods have been successfully applied to a range of challenging sequential decision-making and control tasks. However, most existing techniques are designed for discrete-time systems. As a first step toward their extension to continuous-time systems, this paper considers continuous-time deterministic optimal control problems with entropy regularization. Applying the dynamic programming principle, we derive a novel class of Hamilton-Jacobi-Bellman (HJB) equations and prove that the optimal value function of the maximum entropy control problem corresponds to the unique viscosity solution of the HJB equation. Our maximum entropy formulation is shown to enhance the regularity of the viscosity solution and to be asymptotically consistent as the effect of entropy regularization diminishes. A salient feature of the HJB equations is computational…
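To make the idea of entropy regularization concrete, the sketch below illustrates a standard fact (the Gibbs variational principle), not the paper's own construction: minimizing the expected cost plus tau times the negative entropy over control densities gives the "soft minimum" -tau * log(integral of exp(-c(u)/tau) du), attained by a density proportional to exp(-c(u)/tau). The quadratic cost, the grid, and the names `soft_min` and `gibbs_density` are assumptions chosen for illustration.

```python
import math

def soft_min(costs, tau, du):
    """Entropy-regularized minimum: -tau * log(sum_i exp(-c_i / tau) * du)."""
    m = min(costs)
    # Shift by the minimum for numerical stability (log-sum-exp trick).
    s = sum(math.exp(-(c - m) / tau) for c in costs) * du
    return m - tau * math.log(s)

def gibbs_density(costs, tau, du):
    """Minimizing density p_i proportional to exp(-c_i / tau), normalized on the grid."""
    m = min(costs)
    w = [math.exp(-(c - m) / tau) for c in costs]
    z = sum(w) * du
    return [x / z for x in w]

# Hypothetical one-dimensional control grid and quadratic cost c(u) = (u - 1)^2.
n = 2001
lo, hi = -4.0, 6.0
du = (hi - lo) / (n - 1)
grid = [lo + i * du for i in range(n)]
costs = [(u - 1.0) ** 2 for u in grid]

for tau in (1.0, 0.1, 0.01):
    p = gibbs_density(costs, tau, du)
    mean = sum(u * q for u, q in zip(grid, p)) * du
    var = sum((u - mean) ** 2 * q for u, q in zip(grid, p)) * du
    # As tau shrinks, the soft minimum approaches min(costs) = 0
    # and the density concentrates around the minimizer u = 1.
    print(f"tau={tau}: soft_min={soft_min(costs, tau, du):.5f}, "
          f"mean={mean:.5f}, var={var:.5f}")
```

For this quadratic cost the minimizing density is Gaussian with mean 1 and variance tau/2, and the soft minimum equals -(tau/2) * log(pi * tau), which tends to the unregularized minimum 0 as tau goes to 0. This mirrors, in a static toy setting, the Gaussian controls and the asymptotic consistency mentioned in the summary above.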

Taxonomy

Topics: Adaptive Dynamic Programming Control · Advanced Control Systems Optimization · Advanced Thermodynamics and Statistical Mechanics