Inverse Optimal Control with Discount Factor for Continuous and Discrete-Time Control-Affine Systems and Reinforcement Learning

Luis Rodrigues

arXiv:2211.09917·math.OC·November 21, 2022

Open Access

TL;DR

This paper develops methods for inverse optimal control in control-affine systems, deriving solutions for quadratic value functions and connecting the problem to reinforcement learning, applicable to both discrete and continuous-time systems.

Contribution

It addresses inverse optimal control for control-affine systems whose cost is quadratic in the input: finding the state weighting function that yields a quadratic value function. It also maps the problem to reinforcement learning.

Findings

01. For discrete-time systems with a quadratic value function, the optimal control law is the solution of a regularized least squares program.

02. Conditions for a linear optimal control law.

03. Application examples demonstrating the theoretical results.

Abstract

This paper addresses the inverse optimal control problem of finding the state weighting function that leads to a quadratic value function when the cost on the input is fixed to be quadratic. The paper focuses on a class of infinite-horizon discrete-time and continuous-time optimal control problems whose dynamics are control-affine and whose cost is quadratic in the input. The optimal control policy for this problem is the projection of minus the gradient of the value function onto the space formed by all feasible control directions. This projection points along the control direction of steepest decrease of the value function. For discrete-time systems and a quadratic value function, the optimal control law can be obtained as the solution of a regularized least squares program, which corresponds to a receding horizon control with a single step ahead. For the single-input case and a…
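
The regularized least squares statement in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions that do not appear in the text above: the dynamics are written as x_{k+1} = f(x_k) + g(x_k) u_k, the value function as V(x) = x^T P x with P positive definite, the input cost as u^T R u with R positive definite, and the discount factor gamma multiplies the value at the next state. Under those assumptions, the one-step-ahead problem is to minimize u^T R u + gamma (f + g u)^T P (f + g u) over u; the state weighting term does not depend on u and so does not change the minimizer. The function names are hypothetical.

```python
import numpy as np

def one_step_control(f_x, g_x, P, R, gamma=1.0):
    """Closed-form minimizer of u^T R u + gamma (f_x + g_x u)^T P (f_x + g_x u).

    Assumed setup (not taken from the paper): f_x = f(x), g_x = g(x),
    V(x) = x^T P x, and gamma discounts the next-state value.
    """
    H = R + gamma * g_x.T @ P @ g_x      # regularized normal-equation matrix
    rhs = -gamma * g_x.T @ P @ f_x
    return np.linalg.solve(H, rhs)

def one_step_control_lstsq(f_x, g_x, P, R, gamma=1.0):
    """Same minimizer, posed explicitly as a stacked least squares problem."""
    Lp = np.linalg.cholesky(P)           # P = Lp Lp^T
    Lr = np.linalg.cholesky(R)           # R = Lr Lr^T
    # || sqrt(gamma) Lp^T (f_x + g_x u) ||^2 + || Lr^T u ||^2
    A_ls = np.vstack([np.sqrt(gamma) * Lp.T @ g_x, Lr.T])
    b_ls = -np.concatenate([np.sqrt(gamma) * Lp.T @ f_x, np.zeros(R.shape[0])])
    u, *_ = np.linalg.lstsq(A_ls, b_ls, rcond=None)
    return u

# Illustrative linear special case f(x) = A x, g(x) = B (values are made up).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
P = np.array([[2.0, 0.5], [0.5, 1.0]])
R = np.array([[1.0]])
gamma = 0.95
x = np.array([1.0, -1.0])

u_closed = one_step_control(A @ x, B, P, R, gamma)
u_lstsq = one_step_control_lstsq(A @ x, B, P, R, gamma)
print(u_closed, u_lstsq)
```

In this assumed linear special case the resulting law is linear in the state, u = -K x with K = gamma (R + gamma B^T P B)^{-1} B^T P A. The input weight R plays the role of the regularizer: a larger R shrinks the control toward zero.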

Taxonomy

Topics: Adaptive Dynamic Programming Control · Advanced Control Systems Optimization