Variable Gain Gradient Descent-based Reinforcement Learning for Robust Optimal Tracking Control of Uncertain Nonlinear System with Input-Constraints

Amardeep Mishra; Satadal Ghosh

arXiv:1911.04157·eess.SY·June 16, 2020·1 cites

Open Access

TL;DR

This paper introduces a variable gain gradient descent method in reinforcement learning to enhance convergence speed and stability in controlling uncertain nonlinear systems with input constraints.

Contribution

It proposes a novel critic neural network tuning law with variable gain gradient descent that adapts learning rates based on HJB error, improving convergence and stability.
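
The idea of a gain that adapts to the error can be illustrated with a minimal sketch. This is not the paper's tuning law: the paper works in continuous time with an HJB residual, whereas the code below is a discrete-time stand-in that assumes a linear-in-parameters critic, uses a plain prediction residual in place of the HJB error, and scales the step size by a power of the residual magnitude. The function name `variable_gain_step` and the parameters `base_rate`, `exponent`, and `max_gain` are hypothetical and chosen only for illustration.

```python
def variable_gain_step(weights, features, target,
                       base_rate=0.5, exponent=1.0, max_gain=4.0):
    """One normalized gradient step with a residual-dependent gain.

    Assumptions (not from the paper): a linear critic w^T phi, a scalar
    residual standing in for the HJB error, and a gain capped at max_gain
    so that a single discrete step cannot overshoot on the same sample.
    """
    prediction = sum(w * f for w, f in zip(weights, features))
    residual = prediction - target

    # Variable gain: large when the residual is large, small near convergence.
    gain = min(base_rate * abs(residual) ** exponent, max_gain)

    # Normalization term commonly used in critic tuning laws: (1 + phi^T phi)^2.
    norm = (1.0 + sum(f * f for f in features)) ** 2

    new_weights = [w - gain * residual * f / norm
                   for w, f in zip(weights, features)]
    return new_weights, residual


if __name__ == "__main__":
    w = [0.0, 0.0]
    phi = [1.0, 2.0]
    for step in range(5):
        w, e = variable_gain_step(w, phi, target=3.0)
        print(step, round(e, 4), [round(x, 4) for x in w])
```

With a constant gain the step size is the same whether the residual is large or small. Here the step is aggressive far from the solution and shrinks as the residual decays, which is the qualitative behaviour the contribution describes.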

Findings

01. Faster convergence of critic neural network weights.

02. Tighter residual set for system trajectories.

03. Validated robustness through numerical simulations.

Abstract

In recent times, a variety of Reinforcement Learning (RL) algorithms have been proposed for the optimal tracking problem of continuous-time nonlinear systems with input constraints. Most of these algorithms are based on the notion of uniform ultimate boundedness (UUB) stability, under which higher learning rates are normally avoided in order to keep oscillations in the state error small. However, this comes at the cost of a longer convergence time for the critic neural network weights. This paper addresses that problem by proposing a novel tuning law containing a variable gain gradient descent for the critic neural network that can adjust the learning rate based on the Hamilton-Jacobi-Bellman (HJB) error. By allowing a high learning rate, the proposed variable gain gradient descent tuning law can improve the convergence time of the critic neural network weights. Simultaneously, it also results in…

Taxonomy

Topics: Adaptive Dynamic Programming Control