Robust Reinforcement Learning for Risk-Sensitive Linear Quadratic Gaussian Control

Leilei Cui, Tamer Başar, and Zhong-Ping Jiang

arXiv:2212.02072·eess.SY·December 7, 2023

Open Access

TL;DR

This paper introduces a robust reinforcement learning framework for risk-sensitive linear quadratic Gaussian control, ensuring stability and convergence even with model mismatch and disturbances, applicable to both known and unknown system dynamics.

Contribution

The paper develops a dual-loop policy optimization algorithm with proven convergence and robustness properties, and a novel model-free off-policy method for the case of unknown dynamics.

Findings

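01 The dual-loop policy optimization algorithm converges globally and uniformly.

02 The algorithm is robust against disturbances during learning (small-disturbance input-to-state stability).

03 The approach is effective in numerical simulations.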

Abstract

This paper proposes a novel robust reinforcement learning framework for discrete-time linear systems with model mismatch that may arise from the sim-to-real gap. A key strategy is to invoke advanced techniques from control theory. Using the formulation of the classical risk-sensitive linear quadratic Gaussian control, a dual-loop policy optimization algorithm is proposed to generate a robust optimal controller. The dual-loop policy optimization algorithm is shown to be globally and uniformly convergent, and robust against disturbances during the learning process. This robustness property is called small-disturbance input-to-state stability and guarantees that the proposed policy optimization algorithm converges to a small neighborhood of the optimal controller as long as the disturbance at each learning step is relatively small. In addition, when the system dynamics is unknown, a novel…
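As a rough illustration of what a dual-loop policy optimization can look like in the known-model case, the sketch below uses the well-known link between risk-sensitive LQG control and a linear quadratic zero-sum dynamic game. It is a minimal sketch under stated assumptions, not the paper's algorithm: the dynamics x_{t+1} = A x_t + B u_t + D w_t, the matrices `A`, `B`, `D`, `Q`, `R`, the level `gamma`, the iteration counts, and the helper names `solve_lyapunov` and `dual_loop` are all hypothetical choices made for this example. The outer loop improves the controller gain `K`; the inner loop, for a fixed `K`, iterates on a worst-case disturbance gain `L`.

```python
import numpy as np


def solve_lyapunov(a_cl, q):
    """Solve P = a_cl.T @ P @ a_cl + q with Kronecker products (small systems only)."""
    n = a_cl.shape[0]
    # Row-major vectorization: vec(a_cl.T @ P @ a_cl) = kron(a_cl.T, a_cl.T) @ vec(P).
    lhs = np.eye(n * n) - np.kron(a_cl.T, a_cl.T)
    return np.linalg.solve(lhs, q.reshape(-1)).reshape(n, n)


def dual_loop(A, B, D, Q, R, gamma, K0, outer_iters=30, inner_iters=30):
    """Hypothetical dual-loop policy iteration for the LQ zero-sum game.

    Assumes K0 is stabilizing and gamma is large enough that
    gamma^2 * I - D.T @ P @ D stays positive definite throughout.
    """
    n, m_w = D.shape
    K = K0.copy()
    for _ in range(outer_iters):
        A_K = A - B @ K            # closed loop under the current controller
        Q_K = Q + K.T @ R @ K      # stage cost under the current controller
        L = np.zeros((m_w, n))     # inner loop starts from zero disturbance gain
        for _ in range(inner_iters):
            # Policy evaluation for the pair (K, L).
            P = solve_lyapunov(A_K + D @ L, Q_K - gamma**2 * L.T @ L)
            # Policy improvement for the maximizing (disturbance) player.
            S = gamma**2 * np.eye(m_w) - D.T @ P @ D
            L = np.linalg.solve(S, D.T @ P @ A_K)
        # Outer policy improvement for the controller.
        S = gamma**2 * np.eye(m_w) - D.T @ P @ D
        U = P + P @ D @ np.linalg.solve(S, D.T @ P)
        K = np.linalg.solve(R + B.T @ U @ B, B.T @ U @ A)
    return K, P


# Illustrative (made-up) system: A is already stable, so K0 = 0 is stabilizing.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.1], [0.1]])
Q = np.eye(2)
R = np.eye(1)
gamma = 5.0
K0 = np.zeros((1, 2))

K, P = dual_loop(A, B, D, Q, R, gamma, K0)
print("controller gain K =", K)
print("closed-loop spectral radius =", np.max(np.abs(np.linalg.eigvals(A - B @ K))))
```

At a fixed point, `P` should satisfy the game-type algebraic Riccati equation P = Q + A'UA - A'UB(R + B'UB)^{-1}B'UA with U = P + PD(gamma^2 I - D'PD)^{-1}D'P, which gives a simple way to check convergence numerically. The model-free, off-policy variant mentioned in the abstract is not sketched here.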

Taxonomy

Topics: Advanced Control Systems Optimization