Robust Policy Optimization in Continuous-time Mixed $\mathcal{H}_2/\mathcal{H}_\infty$ Stochastic Control

Leilei Cui; Lekan Molu

arXiv:2209.04477 · eess.SY · June 30, 2023

TL;DR

This paper develops robust policy optimization algorithms for continuous-time stochastic control systems, combining $\mathcal{H}_2$ and $\mathcal{H}_\infty$ norms to ensure stability and robustness, with proven convergence guarantees for the model-based algorithm.

Contribution

It introduces both model-based and model-free RL algorithms for continuous-time stochastic control, incorporating robustness metrics and stability guarantees in a complex dynamical setting.

Findings

01

Proposed algorithms achieve stability and robustness in continuous-time stochastic control.

02

Rigorous convergence guarantees for model-based policy optimization.

03

Effective handling of unknown nonlinear dynamics in a stochastic setting.

Abstract

Following the recent resurgence in establishing linear control theoretic benchmarks for reinforcement learning (RL)-based policy optimization (PO) for complex dynamical systems with continuous state and action spaces, an optimal control problem for a continuous-time infinite-dimensional linear stochastic system possessing additive Brownian motion is optimized on a cost that is an exponent of the quadratic form of the state, input, and disturbance terms. We lay out a model-based and a model-free algorithm for RL-based stochastic PO. For the model-based algorithm, we establish rigorous convergence guarantees. For the sampling-based algorithm, over trajectory arcs that emanate from the phase space, we find that the Hamilton-Jacobi-Bellman equation parameterizes trajectory costs, resulting in a discrete-time (input and state-based) sampling scheme accompanied by unknown nonlinear dynamics…

Taxonomy

Topics: Stochastic processes and financial applications · Risk and Portfolio Optimization · Reinforcement Learning in Robotics