Learning-Based Adaptive Optimal Control of Linear Time-Delay Systems: A Policy Iteration Approach

Leilei Cui; Bo Pang; Zhong-Ping Jiang

arXiv:2210.00204 · eess.SY · October 4, 2022 · 1 citation

TL;DR

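This paper introduces a reinforcement-learning-based policy iteration method for adaptive optimal control of linear time-delay systems. Controllers are learned from finite samples of input and state data without exact model knowledge, and the method is validated on examples from metal cutting and autonomous driving.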

Contribution

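The paper develops a data-driven policy iteration algorithm that iteratively solves the infinite-dimensional algebraic Riccati equation (ARE) arising in linear time-delay systems, without exact model knowledge. The recursive algorithm is new for continuous-time time-delay systems even when the model is known.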

Findings

01. The learning-based controller is validated on a time-delay system arising from metal cutting.

02. The algorithm learns adaptive optimal controllers from finite samples of input and state data.

03. The method is also validated on an autonomous driving application.

Abstract

This paper studies the adaptive optimal control problem for a class of linear time-delay systems described by delay differential equations (DDEs). A crucial strategy is to take advantage of recent developments in reinforcement learning and adaptive dynamic programming and to develop novel methods for learning adaptive optimal controllers from finite samples of input and state data. In this paper, a data-driven policy iteration (PI) algorithm is proposed to solve the infinite-dimensional algebraic Riccati equation (ARE) iteratively in the absence of exact model knowledge. Interestingly, the proposed recursive PI algorithm is new in the present context of continuous-time time-delay systems, even when the model is assumed known. The efficacy of the proposed learning-based control methods is validated by means of practical applications arising from metal cutting and autonomous driving.
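
As a point of reference for the policy iteration idea mentioned in the abstract, the delay-free, finite-dimensional counterpart is Kleinman's algorithm, which alternates a Lyapunov-equation policy evaluation with a gain update. The sketch below is not the paper's method: it assumes a known model (A, B) with no delay, whereas the paper addresses the infinite-dimensional ARE and replaces model knowledge with input and state data. The double-integrator example, the helper names `solve_lyapunov` and `policy_iteration`, and the initial gain `K0` are illustrative assumptions.

```python
import numpy as np


def solve_lyapunov(A_c, M):
    """Solve A_c^T P + P A_c + M = 0 for P by vectorizing the equation."""
    n = A_c.shape[0]
    I = np.eye(n)
    # Row-major vectorization: vec(A_c^T P) + vec(P A_c) = L vec(P)
    L = np.kron(A_c.T, I) + np.kron(I, A_c.T)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def policy_iteration(A, B, Q, R, K0, iters=20):
    """Kleinman-style policy iteration for the finite-dimensional ARE.

    K0 must be stabilizing, i.e. A - B K0 has eigenvalues in the open
    left half-plane; otherwise the Lyapunov step is not meaningful.
    """
    K = K0
    for _ in range(iters):
        A_c = A - B @ K                           # closed loop under current policy
        P = solve_lyapunov(A_c, Q + K.T @ R @ K)  # policy evaluation
        K = np.linalg.solve(R, B.T @ P)           # policy improvement
    return P, K


# Illustrative example (assumption): a double integrator with unit weights.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.0]])  # stabilizing initial gain

P, K = policy_iteration(A, B, Q, R, K0)
print(P)
print(K)
```

For this example the ARE can be solved by hand, giving P = [[√3, 1], [1, √3]] and K = [1, √3], so the iterates can be checked against a known answer. The fixed iteration count is a simplification; a tolerance on the change in P between iterations would be a more typical stopping rule.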

Taxonomy

Topics: Adaptive Dynamic Programming Control