Concurrent learning-based approximate optimal regulation

Rushikesh Kamalapurkar; Patrick Walters; Warren Dixon

arXiv:1304.3477 · cs.SY · July 25, 2017

TL;DR

This paper presents a concurrent learning-based method for online approximate optimal regulation of deterministic systems. The method removes the persistence of excitation (PE) requirement and establishes uniformly ultimately bounded (UUB) convergence through a Lyapunov-based analysis.

Contribution

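The paper proposes a concurrent learning approach that removes the PE condition from online approximate optimal regulation. Given a model of the system, the Bellman error is evaluated at arbitrary points in the state space, and a concurrent learning-based parameter identifier compensates for parametric uncertainty in the plant dynamics.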

Findings

01 UUB convergence of the system states to the origin

02 UUB convergence of the developed policy to the optimal policy

03 Simulation results demonstrating effective control performance

Abstract

In deterministic systems, reinforcement learning-based online approximate optimal control methods typically require a restrictive persistence of excitation (PE) condition for convergence. This paper presents a concurrent learning-based solution to the online approximate optimal regulation problem that eliminates the need for PE. The development is based on the observation that given a model of the system, the Bellman error, which quantifies the deviation of the system Hamiltonian from the optimal Hamiltonian, can be evaluated at any point in the state space. Further, a concurrent learning-based parameter identifier is developed to compensate for parametric uncertainty in the plant dynamics. Uniformly ultimately bounded (UUB) convergence of the system states to the origin, and UUB convergence of the developed policy to the optimal policy are established using a Lyapunov-based analysis,…
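The observation that a model lets the Bellman error be evaluated at any point in the state space can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a scalar linear plant x_dot = A*x + B*u, a quadratic cost Q*x^2 + R*u^2, a one-weight value-function approximation V(x) = w*x^2, and the names `policy`, `bellman_error`, `P_OPT`, and `POINTS`. The paper's own update laws and parameter identifier are not reproduced here.

```python
import math

# Hypothetical scalar plant and cost (illustrative values, not from the paper):
#   x_dot = A*x + B*u,  cost integrand = Q*x^2 + R*u^2
A, B, Q, R = -1.0, 1.0, 1.0, 1.0


def policy(w, x):
    """Policy induced by the approximation V(x) = w*x^2: u = -(1/2) R^-1 B dV/dx."""
    grad_v = 2.0 * w * x
    return -0.5 * (1.0 / R) * B * grad_v


def bellman_error(w, x):
    """Bellman error at state x: dV/dx * (A*x + B*u) + Q*x^2 + R*u^2.

    Only the model (A, B) and the weight w are needed, so x does not have
    to lie on the trajectory the system actually follows.
    """
    u = policy(w, x)
    grad_v = 2.0 * w * x
    return grad_v * (A * x + B * u) + Q * x * x + R * u * u


# Positive root of the scalar Riccati equation 2*A*p - (B^2/R)*p^2 + Q = 0,
# which gives the optimal value function V*(x) = p*x^2 for this toy problem.
P_OPT = (A + math.sqrt(A * A + B * B * Q / R)) * R / (B * B)

# Arbitrary evaluation points, chosen freely rather than visited by the system.
POINTS = [-5.0, -1.5, -0.2, 0.3, 2.0, 4.0]

if __name__ == "__main__":
    for x in POINTS:
        print(x, bellman_error(0.0, x), bellman_error(P_OPT, x))
```

With the optimal weight, the Bellman error vanishes (up to rounding) at every sampled point, while a wrong weight leaves a nonzero residual at each nonzero state. That residual, sampled at chosen points rather than along the trajectory, is the kind of signal an update law can use without relying on the trajectory being exciting.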
