Hamilton-Jacobi-Bellman Equations for Maximum Entropy Optimal Control
Jeongho Kim, Insoon Yang

TL;DR
This paper extends maximum entropy reinforcement learning to continuous-time deterministic control problems by deriving novel Hamilton-Jacobi-Bellman equations, providing computationally tractable solutions and characterizing optimal controls explicitly.
Contribution
It introduces a new class of HJB equations for continuous-time maximum entropy control, proves that the optimal value function is their unique viscosity solution, and derives explicit solutions in special cases.
Findings
The optimal value function is the unique viscosity solution of the HJB equation.
The maximum entropy formulation enhances the regularity of the viscosity solution.
Explicit solutions are available for linear-quadratic problems and for Gaussian controls.
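The link between quadratic costs and Gaussian controls can be checked numerically in a scalar toy setting. The sketch below rests on an assumption that is not stated on this page: that entropy regularization with weight `alpha` replaces the pointwise minimizer over controls by a Gibbs density proportional to `exp(-g(u)/alpha)`. Here `g(u) = r*u**2 + q*u` stands in for a quadratic running cost plus a linear costate term. The names `gibbs_density`, `r`, `q`, and `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np

def gibbs_density(g_vals, du, alpha):
    """Normalized density proportional to exp(-g/alpha) on a uniform grid."""
    # Subtract the minimum before exponentiating for numerical stability.
    w = np.exp(-(g_vals - g_vals.min()) / alpha)
    return w / (w.sum() * du)

# Hypothetical scalar example: g(u) = r*u^2 + q*u (assumed form, not from the paper).
r, q, alpha = 2.0, 1.0, 0.5
u = np.linspace(-10.0, 10.0, 200001)
du = u[1] - u[0]
mu = gibbs_density(r * u**2 + q * u, du, alpha)

# Moments of the Gibbs density, computed by a Riemann sum.
mean = np.sum(u * mu) * du
var = np.sum((u - mean) ** 2 * mu) * du

# Completing the square in the exponent gives a Gaussian with these moments.
mean_closed = -q / (2.0 * r)
var_closed = alpha / (2.0 * r)

print(mean, mean_closed)
print(var, var_closed)
```

Under this assumption the variance scales linearly with `alpha`, so the density concentrates on the deterministic minimizer `-q/(2*r)` as the entropy weight shrinks.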
Abstract
Maximum entropy reinforcement learning (RL) methods have been successfully applied to a range of challenging sequential decision-making and control tasks. However, most existing techniques are designed for discrete-time systems. As a first step toward their extension to continuous-time systems, this paper considers continuous-time deterministic optimal control problems with entropy regularization. Applying the dynamic programming principle, we derive a novel class of Hamilton-Jacobi-Bellman (HJB) equations and prove that the optimal value function of the maximum entropy control problem corresponds to the unique viscosity solution of the HJB equation. Our maximum entropy formulation is shown to enhance the regularity of the viscosity solution and to be asymptotically consistent as the effect of entropy regularization diminishes. A salient feature of the HJB equations is computational…
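The asymptotic consistency mentioned in the abstract can be illustrated on a one-dimensional toy problem. This is a minimal sketch under an assumption the excerpt does not state: that the entropy-regularized minimization over controls takes the standard soft-min form `-alpha * log(integral of exp(-g(u)/alpha) du)`. The cost `g` and the function name `soft_min` are hypothetical.

```python
import numpy as np

def soft_min(g_vals, du, alpha):
    """-alpha * log(integral of exp(-g/alpha) du), via a stable log-sum-exp."""
    g_min = g_vals.min()
    integral = np.sum(np.exp(-(g_vals - g_min) / alpha)) * du
    return g_min - alpha * np.log(integral)

# Hypothetical cost on a bounded control set; the hard minimum is 0.5 at u = 1.
u = np.linspace(-3.0, 3.0, 60001)
du = u[1] - u[0]
g = (u - 1.0) ** 2 + 0.5

# Gap between the soft minimum and the hard minimum for shrinking alpha.
alphas = [1.0, 0.1, 0.01, 0.001]
gaps = [abs(soft_min(g, du, a) - g.min()) for a in alphas]

for a, gap in zip(alphas, gaps):
    print(a, gap)
```

In this toy case the gap shrinks as `alpha` decreases, which matches the qualitative behavior the abstract describes; it says nothing about the rate or the conditions established in the paper.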
Taxonomy
Topics: Adaptive Dynamic Programming Control · Advanced Control Systems Optimization · Advanced Thermodynamics and Statistical Mechanics
