Value-Gradient based Formulation of Optimal Control Problem and Machine Learning Algorithm
Alain Bensoussan, Jiayue Han, Sheung Chi Phillip Yam, Xiang Zhou

TL;DR
This paper introduces a novel value-gradient formulation for optimal control, combining PDE-based iterative schemes with machine learning to enhance accuracy, efficiency, and robustness in solving continuous-time deterministic problems.
Contribution
It proposes a new value-gradient PDE formulation, an efficient parallel iterative scheme, and a machine learning approach to improve optimal control solutions.
Findings
Significantly improves accuracy of control estimates.
Enhances efficiency and robustness with less data.
The iterative scheme converges linearly in a weighted $L_\alpha^2$ sense.
Abstract
The optimal control problem is typically solved by first finding the value function through the Hamilton-Jacobi equation (HJE) and then taking the minimizer of the Hamiltonian to obtain the control. In this work, instead of focusing on the value function, we propose a new formulation for the gradient of the value function (value-gradient) as a decoupled system of partial differential equations in the context of the continuous-time deterministic discounted optimal control problem. We develop an efficient iterative scheme that solves this system of equations in parallel by utilizing the property that they share the same characteristic curves as the HJE for the value function. For the theoretical part, we prove that this iterative scheme converges linearly in the $L_\alpha^2$ sense for some suitable exponent $\alpha$ in a weight function. For the numerical method, we combine the characteristic line method with…
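To make the relation between the value function, its gradient, and the control concrete, here is a minimal sketch on a scalar discounted linear-quadratic problem. This toy problem is an assumption chosen for illustration and does not come from the paper: minimize $\int_0^\infty e^{-\rho t}(q x^2 + r u^2)\,dt$ subject to $\dot{x} = a x + b u$, for which $V(x) = p x^2$ in closed form. All names (`solve_scalar_lq`, `value_gradient`, `control_from_gradient`, and the parameters `a`, `b`, `q`, `r`, `rho`) are hypothetical. The gradient equation checked in `gradient_residual` is obtained here by differentiating the HJE in $x$ for this example only; it is not claimed to be the exact system used by the authors.

```python
import math


def solve_scalar_lq(a, b, q, r, rho):
    """Positive root p of (b^2/r) p^2 + (rho - 2a) p - q = 0, so V(x) = p x^2.

    Assumed toy problem (not from the paper): scalar discounted LQ control.
    """
    k = b * b / r
    c = rho - 2.0 * a
    return (-c + math.sqrt(c * c + 4.0 * k * q)) / (2.0 * k)


def value_gradient(p, x):
    """Value-gradient lambda(x) = V'(x) = 2 p x for V(x) = p x^2."""
    return 2.0 * p * x


def control_from_gradient(lam, b, r):
    """Minimizer over u of the Hamiltonian terms r u^2 + lam * b * u."""
    return -b * lam / (2.0 * r)


def hje_residual(a, b, q, r, rho, p, x):
    """Residual of rho V = q x^2 + r u^2 + V'(x) (a x + b u) at the minimizing u."""
    lam = value_gradient(p, x)
    u = control_from_gradient(lam, b, r)
    return rho * p * x * x - (q * x * x + r * u * u + lam * (a * x + b * u))


def gradient_residual(a, b, q, r, rho, p, x):
    """Residual of the x-derivative of the HJE, written for lambda = V'.

    rho * lam = 2 q x + lam'(x) * (a x + b u) + a * lam, with lam'(x) = 2 p.
    The u-derivative term vanishes because u minimizes the Hamiltonian.
    """
    lam = value_gradient(p, x)
    u = control_from_gradient(lam, b, r)
    dlam_dx = 2.0 * p
    return rho * lam - (2.0 * q * x + dlam_dx * (a * x + b * u) + a * lam)


# Illustrative parameter values (assumptions, not taken from the paper).
a, b, q, r, rho = 0.5, 1.0, 1.0, 1.0, 0.1
p = solve_scalar_lq(a, b, q, r, rho)
for x in (-2.0, 0.3, 1.5):
    lam = value_gradient(p, x)
    print(x, lam, control_from_gradient(lam, b, r),
          hje_residual(a, b, q, r, rho, p, x),
          gradient_residual(a, b, q, r, rho, p, x))
```

In this example the control depends on the state only through the value-gradient, which is why an accurate estimate of the gradient itself, rather than of the value function, is what matters for the control.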
Taxonomy
Topics: Model Reduction and Neural Networks · Advanced Numerical Methods in Computational Mathematics · Advanced Control Systems Optimization
