Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems

Bo Pang; Zhong-Ping Jiang

arXiv:2107.07788 · eess.SY · December 7, 2021 · 1 citation

TL;DR

This paper introduces a new reinforcement learning algorithm for adaptive optimal control of linear stochastic systems, which converges to near-optimal policies directly from data without system identification.

Contribution

A novel off-policy reinforcement learning algorithm based on optimistic least-squares policy iteration for continuous-time stochastic systems is proposed, enabling direct data-driven control policy optimization.

Findings

1. Algorithm converges to a neighborhood of the optimal solution with probability one.

2. Validated on a triple inverted pendulum example demonstrating feasibility.

3. Effective in handling systems with both additive and multiplicative noises.

Abstract

This paper studies the adaptive optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises, using reinforcement learning techniques. A novel off-policy reinforcement learning algorithm based on policy iteration, named optimistic least-squares-based policy iteration, is proposed. Starting from an initial admissible control policy, it iteratively finds near-optimal policies of the adaptive optimal stationary control problem directly from input/state data, without explicitly identifying any system matrices. The solutions given by the proposed optimistic least-squares-based policy iteration are proved to converge to a small neighborhood of the optimal solution with probability one, under mild conditions. The application of the proposed algorithm to a triple inverted pendulum example validates its feasibility and…
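
The abstract builds on policy iteration started from an admissible policy. As a rough illustration of that underlying idea only, the sketch below runs classical model-based policy iteration on a hypothetical scalar system dx = (a x + b u) dt + sigma x dw with running cost q x^2 + r u^2. It is not the paper's data-driven, off-policy, least-squares algorithm: the scalar model, the parameters a, b, q, r, sigma, and all function names are assumptions made for this example, the system parameters are assumed known, and additive noise is left out.

```python
import math


def evaluate_policy(k, a, b, q, r, sigma=0.0):
    """Policy evaluation for u = -k*x: solve the scalar Lyapunov-type equation
    (2*(a - b*k) + sigma**2) * p + q + r*k**2 = 0 for the cost weight p."""
    rate = 2.0 * (a - b * k) + sigma ** 2
    if rate >= 0.0:
        # The gain must be mean-square stabilizing to be admissible.
        raise ValueError("gain k is not admissible (not mean-square stabilizing)")
    return -(q + r * k ** 2) / rate


def improve_policy(p, b, r):
    """Policy improvement: the gain that is greedy with respect to p."""
    return b * p / r


def policy_iteration(k0, a, b, q, r, sigma=0.0, iterations=20):
    """Alternate evaluation and improvement, starting from an admissible k0."""
    k = k0
    history = []  # cost weights p_0, p_1, ... produced by each evaluation
    for _ in range(iterations):
        p = evaluate_policy(k, a, b, q, r, sigma)
        history.append(p)
        k = improve_policy(p, b, r)
    return k, history


def riccati_solution(a, b, q, r, sigma=0.0):
    """Positive root of (2a + sigma^2) p - (b^2 / r) p^2 + q = 0, for comparison."""
    c = 2.0 * a + sigma ** 2
    return r * (c + math.sqrt(c * c + 4.0 * b * b * q / r)) / (2.0 * b * b)


if __name__ == "__main__":
    # Hypothetical unstable plant (a > 0) with an admissible initial gain k0 = 3.
    k, history = policy_iteration(3.0, a=1.0, b=1.0, q=1.0, r=1.0, sigma=0.5)
    print("final gain:", k)
    print("final p:", history[-1])
    print("Riccati p:", riccati_solution(1.0, 1.0, 1.0, 1.0, sigma=0.5))
```

In this toy setting the evaluated cost weights decrease toward the Riccati solution. The algorithm described in the abstract instead works from input/state data without identifying the system matrices, and is proved to converge to a neighborhood of the optimal solution rather than to the exact optimum.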

Taxonomy

Topics: Adaptive Dynamic Programming Control · Smart Grid Energy Management · Energy Load and Power Forecasting