Robust Reinforcement Learning for Risk-Sensitive Linear Quadratic Gaussian Control
Leilei Cui, Tamer Başar, and Zhong-Ping Jiang

TL;DR
This paper introduces a robust reinforcement learning framework for risk-sensitive linear quadratic Gaussian control. It ensures stability and convergence despite model mismatch and disturbances, and applies to both known and unknown system dynamics.
Contribution
It develops a dual-loop policy optimization algorithm with proven convergence and robustness properties, and a novel model-free off-policy method for unknown dynamics.
Findings
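The dual-loop policy optimization algorithm is globally and uniformly convergent.
The algorithm is robust against small disturbances at each learning step (small-disturbance input-to-state stability).
The approach is effective in numerical simulations.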
Abstract
This paper proposes a novel robust reinforcement learning framework for discrete-time linear systems with model mismatch that may arise from the sim-to-real gap. A key strategy is to invoke advanced techniques from control theory. Using the formulation of the classical risk-sensitive linear quadratic Gaussian control, a dual-loop policy optimization algorithm is proposed to generate a robust optimal controller. The dual-loop policy optimization algorithm is shown to be globally and uniformly convergent, and robust against disturbances during the learning process. This robustness property is called small-disturbance input-to-state stability and guarantees that the proposed policy optimization algorithm converges to a small neighborhood of the optimal controller as long as the disturbance at each learning step is relatively small. In addition, when the system dynamics is unknown, a novel…
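The dual-loop idea can be illustrated with a minimal model-based sketch. It assumes the standard reformulation of risk-sensitive LQG control as a zero-sum linear quadratic dynamic game with dynamics x_{k+1} = A x_k + B u_k + D w_k, controller u = -K x, adversary w = L x, and attenuation level gamma. The inner loop evaluates the worst-case cost of a fixed gain K, and the outer loop improves K. The function names, the matrices, and the numerical values are hypothetical choices for illustration, and the update rules follow the textbook policy-iteration form rather than being taken from the paper.

```python
import numpy as np

def solve_dlyap(Ac, W):
    """Solve P = Ac.T @ P @ Ac + W by vectorization (adequate for small n)."""
    n = Ac.shape[0]
    M = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(M, W.reshape(-1)).reshape(n, n)

def worst_case_weight(P, D, gamma):
    """U = P + P D (gamma^2 I - D' P D)^{-1} D' P."""
    q = D.shape[1]
    S = gamma**2 * np.eye(q) - D.T @ P @ D
    return P + P @ D @ np.linalg.solve(S, D.T @ P)

def inner_loop(A_K, D, Q_K, gamma, iters=50):
    """Inner loop: worst-case cost matrix P_K of a fixed controller gain."""
    n, q = D.shape
    L = np.zeros((q, n))  # adversary gain, started at zero
    for _ in range(iters):
        Ac = A_K + D @ L
        # Evaluate the cost of the current (controller, adversary) pair.
        P = solve_dlyap(Ac, Q_K - gamma**2 * L.T @ L)
        # Improve the adversary against the fixed controller.
        S = gamma**2 * np.eye(q) - D.T @ P @ D
        L = np.linalg.solve(S, D.T @ P @ A_K)
    return P

def dual_loop(A, B, D, Q, R, gamma, K0, outer_iters=30):
    """Outer loop: improve the controller gain using the inner-loop result."""
    K = K0
    for _ in range(outer_iters):
        P = inner_loop(A - B @ K, D, Q + K.T @ R @ K, gamma)
        U = worst_case_weight(P, D, gamma)
        K = np.linalg.solve(R + B.T @ U @ B, B.T @ U @ A)
    return K, P

# Hypothetical example system (values chosen only for illustration).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.1], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
gamma = 5.0
K0 = np.zeros((1, 2))  # A is stable here, so K0 = 0 is a valid starting gain

K, P = dual_loop(A, B, D, Q, R, gamma, K0)

# Check the result against the game algebraic Riccati equation.
U = worst_case_weight(P, D, gamma)
gare = Q + A.T @ U @ A - A.T @ U @ B @ np.linalg.solve(R + B.T @ U @ B, B.T @ U @ A)
residual = np.max(np.abs(P - gare))
print("K =", K)
print("Riccati residual =", residual)
```

The sketch assumes the initial gain is stabilizing and that gamma is large enough for gamma^2 I - D' P D to stay positive definite. It also uses the true A, B, and D, so it corresponds to the known-dynamics case only, not the model-free off-policy method.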
Taxonomy
Topics: Advanced Control Systems Optimization
