Average Cost Optimal Control of Stochastic Systems Using Reinforcement Learning

Jing Lai; Junlin Xiong

arXiv:2010.06236·eess.SY·October 14, 2020

TL;DR

This paper develops an online reinforcement learning method for average-cost optimal control of discrete-time stochastic systems, proves that the learned solution converges to the optimal one, and illustrates the method with a numerical example.

Contribution

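The paper introduces an online, model-free RL algorithm that estimates the kernel matrix of the Q-function by recursive least squares and updates the control gain from data along the system trajectories, with a proof that both converge to their optimal values.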

Findings

01 The control gain and the Q-function kernel matrix are proved to converge to their optimal values.

02 The online learning algorithm is illustrated with a numerical example.

03 The method applies to discrete-time systems with both multiplicative and additive noise.

Abstract

This paper addresses the average cost minimization problem for discrete-time systems with multiplicative and additive noises via reinforcement learning. Using the Q-function, we propose an online learning scheme that estimates the kernel matrix of the Q-function and updates the control gain using data along the system trajectories. The obtained control gain and kernel matrix are proved to converge to the optimal ones. To implement the proposed learning scheme, an online model-free reinforcement learning algorithm is given, in which the recursive least squares method is used to estimate the kernel matrix of the Q-function. A numerical example is presented to illustrate the proposed approach.
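To make the two ingredients named in the abstract more concrete, here is a minimal Python sketch of (a) recursive least squares estimation of the symmetric kernel matrix of a quadratic function and (b) a control gain computed from that kernel. It is not the paper's algorithm: the paper builds its regression from system trajectories, whereas this sketch fits a quadratic form directly to synthetic noisy targets. The function names, the kernel `H_true`, the dimensions, the noise level, and the sign convention `u = -K x` are all assumptions made for illustration.

```python
import numpy as np

def quad_features(z):
    """Features phi(z) with z^T H z = phi(z) @ theta, theta = upper triangle of H."""
    rows, cols = np.triu_indices(z.size)
    # Off-diagonal entries of a symmetric H appear twice in z^T H z.
    scale = np.where(rows == cols, 1.0, 2.0)
    return scale * np.outer(z, z)[rows, cols]

def theta_to_kernel(theta, n):
    """Rebuild the symmetric n x n kernel from its upper-triangular entries."""
    H = np.zeros((n, n))
    H[np.triu_indices(n)] = theta
    return H + H.T - np.diag(np.diag(H))

def rls_update(theta, P, phi, y, forgetting=1.0):
    """One standard recursive least squares step for the model y = phi @ theta."""
    P_phi = P @ phi
    gain = P_phi / (forgetting + phi @ P_phi)
    theta = theta + gain * (y - phi @ theta)
    P = (P - np.outer(gain, P_phi)) / forgetting
    return theta, P

def gain_from_kernel(H, n_x):
    """Minimizer of [x;u]^T H [x;u] over u is u = -K x with K = H_uu^{-1} H_ux."""
    return np.linalg.solve(H[n_x:, n_x:], H[n_x:, :n_x])

def demo(n_samples=500, seed=0):
    # Hypothetical kernel for a system with 2 states and 1 input (assumption).
    H_true = np.array([[2.0, 0.3, 0.5],
                       [0.3, 1.5, -0.2],
                       [0.5, -0.2, 1.0]])
    n, n_x = 3, 2
    rng = np.random.default_rng(seed)
    theta = np.zeros(n * (n + 1) // 2)
    P = 1e6 * np.eye(theta.size)  # large initial covariance: weak prior on theta
    for _ in range(n_samples):
        z = rng.normal(size=n)                       # stacked [x; u] sample
        y = z @ H_true @ z + 0.01 * rng.normal()     # noisy quadratic target
        theta, P = rls_update(theta, P, quad_features(z), y)
    H_est = theta_to_kernel(theta, n)
    return H_est, gain_from_kernel(H_est, n_x), H_true

if __name__ == "__main__":
    H_est, K_est, H_true = demo()
    print("max kernel error:", np.abs(H_est - H_true).max())
    print("estimated gain K:", K_est)
```

Parameterizing the kernel by its upper triangle keeps the estimate symmetric by construction and reduces the number of unknowns from n^2 to n(n+1)/2, which is a common choice in Q-learning for linear quadratic problems.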

Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Advanced Control Systems Optimization