Off Policy Risk Sensitive Reinforcement Learning Based Optimal Tracking Control with Prescribe Performances

C. Li, Y. Wang, F. Liu, and M. Buss

arXiv:2009.00476·eess.SY·September 2, 2020

Open Access

TL;DR

This paper introduces an off-policy reinforcement learning control method that ensures prescribed performance in optimal tracking tasks, using risk-sensitive penalties and experience data for stable, practical implementation.

Contribution

It develops a novel off-policy RL framework with risk-sensitive constraints and experience-based critic learning, guaranteeing stability and convergence for prescribed performance tracking.

Findings

01 The proposed method achieves the prescribed performance during learning.

02 The critic weight update law guarantees weight convergence without external excitation.

03 Simulation confirms the effectiveness of the control strategy.

Abstract

An off-policy reinforcement-learning-based control strategy is developed for the optimal tracking control problem so that all states achieve the prescribed performance during the learning process. The optimal tracking control problem is converted into an optimal regulation problem based on an auxiliary system. The prescribed-performance requirements are transformed into constraint satisfaction problems, which are handled by risk-sensitive state penalty terms within an optimization framework. To obtain approximate solutions of the Hamilton-Jacobi-Bellman equation, an off-policy adaptive critic learning architecture is developed that uses current data and experience data together. By using experience data, the proposed weight update law of the critic learning agent guarantees convergence of the weights to their actual values. This technique is more practical than common…
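
The role of experience data in the critic update can be illustrated with a minimal Python sketch. It is not the paper's update law: it assumes a critic that is linear in its weights, so that each data point contributes a Bellman-type residual `regressor @ w + cost`, and it applies a normalized gradient step over the current sample together with a stored batch. The names `critic_step`, `train`, and `w_true`, the learning rate, and the synthetic data are all hypothetical.

```python
import numpy as np

def critic_step(w, regressors, costs, lr=0.5):
    """One normalized gradient step on the squared residuals of all samples.

    Assumed (hypothetical) model: residual_k = regressors[k] @ w + costs[k],
    which is zero for every sample when w equals the true critic weights.
    """
    residuals = regressors @ w + costs
    scale = (1.0 + np.sum(regressors**2, axis=1)) ** 2  # normalization term
    grad = regressors.T @ (residuals / scale)
    return w - lr * grad

def train(regressors, costs, steps=5000):
    """Iterate the update from zero initial weights."""
    w = np.zeros(regressors.shape[1])
    for _ in range(steps):
        w = critic_step(w, regressors, costs)
    return w

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])        # hypothetical "actual" weights

# Current data: a single regressor that excites only the first weight.
current = np.array([[1.0, 0.0, 0.0]])
# Experience data: stored regressors that span the whole weight space.
memory = rng.normal(size=(10, 3))

def costs_for(regressors):
    # Synthetic costs chosen so that w_true gives zero residual.
    return -(regressors @ w_true)

w_current_only = train(current, costs_for(current))
both = np.vstack([current, memory])
w_with_memory = train(both, costs_for(both))

err_current_only = np.linalg.norm(w_current_only - w_true)
err_with_memory = np.linalg.norm(w_with_memory - w_true)
```

In this toy setup the current regressor alone identifies only one weight, so the other weights stay at their initial values. Adding stored samples that span the weight space drives the estimate toward `w_true` without injecting a probing signal, which is the practical point the abstract makes about experience data.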

Taxonomy

Topics: Adaptive Dynamic Programming Control · Adaptive Control of Nonlinear Systems · Extremum Seeking Control Systems