Inverse Optimal Control with Discount Factor for Continuous and Discrete-Time Control-Affine Systems and Reinforcement Learning
Luis Rodrigues

TL;DR
This paper develops inverse optimal control methods for control-affine systems in both discrete and continuous time, deriving the state weighting that yields a quadratic value function and connecting the problem to reinforcement learning.
Contribution
It introduces a novel approach to inverse optimal control for control-affine systems whose cost is quadratic in the input, including a mapping to reinforcement learning frameworks.
Findings
For discrete-time systems with a quadratic value function, the optimal control law is the solution of a regularized least squares program.
Conditions under which the optimal control law is linear.
Application examples that illustrate the theoretical results.
Abstract
This paper addresses the inverse optimal control problem of finding the state weighting function that leads to a quadratic value function when the cost on the input is fixed to be quadratic. The paper focuses on a class of infinite horizon discrete-time and continuous-time optimal control problems whose dynamics are control-affine and whose cost is quadratic in the input. The optimal control policy for this problem is the projection of minus the gradient of the value function onto the space formed by all feasible control directions. This projection points along the control direction of steepest decrease of the value function. For discrete-time systems and a quadratic value function the optimal control law can be obtained as the solution of a regularized least squares program, which corresponds to a receding horizon control with a single step ahead. For the single input case and a…
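The regularized least squares statement in the abstract can be illustrated with a minimal sketch. It assumes discrete-time dynamics x_next = f(x) + g(x) u, a quadratic value function V(x) = x^T P x with P positive definite, an input cost u^T R u with R positive definite, and a discount factor gamma; these symbols and the function names are illustrative assumptions, not the paper's notation. Under those assumptions, the one-step-ahead problem min_u u^T R u + gamma V(f(x) + g(x) u) has the closed form u = -(R + gamma g^T P g)^{-1} gamma g^T P f, which is also the solution of a stacked least squares problem in which R plays the role of the regularizer.

```python
import numpy as np

def one_step_control(f_x, g_x, P, R, gamma=1.0):
    """Closed-form minimizer of u^T R u + gamma (f_x + g_x u)^T P (f_x + g_x u).

    f_x, g_x are f(x) and g(x) evaluated at the current state (assumed inputs).
    """
    H = R + gamma * g_x.T @ P @ g_x          # regularized Gram matrix
    return -np.linalg.solve(H, gamma * g_x.T @ P @ f_x)

def one_step_control_lstsq(f_x, g_x, P, R, gamma=1.0):
    """Same minimizer, written as the least squares problem min_u ||A u - b||^2."""
    Lp = np.linalg.cholesky(P)               # P = Lp Lp^T
    Lr = np.linalg.cholesky(R)               # R = Lr Lr^T
    s = np.sqrt(gamma)
    A = np.vstack([s * Lp.T @ g_x, Lr.T])    # data rows, then regularization rows
    b = np.concatenate([-s * Lp.T @ f_x, np.zeros(R.shape[0])])
    u, *_ = np.linalg.lstsq(A, b, rcond=None)
    return u

if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    A_lin = np.array([[1.0, 0.1], [0.0, 1.0]])
    g_x = np.array([[0.0], [0.1]])
    P = np.array([[2.0, 0.5], [0.5, 1.0]])
    R = np.array([[1.0]])
    x = np.array([1.0, -0.5])
    print(one_step_control(A_lin @ x, g_x, P, R, gamma=0.95))
    print(one_step_control_lstsq(A_lin @ x, g_x, P, R, gamma=0.95))
```

Both functions should return the same input up to numerical precision. The second form makes the regularization explicit: the rows built from R penalize the input magnitude, while the rows built from P penalize the value of the predicted next state.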
Taxonomy
Topics: Adaptive Dynamic Programming Control · Advanced Control Systems Optimization
