Robust Policy Optimization in Continuous-time Mixed $\mathcal{H}_2/\mathcal{H}_\infty$ Stochastic Control
Leilei Cui, Lekan Molu

TL;DR
This paper develops robust policy optimization algorithms for continuous-time stochastic control systems, combining $\mathcal{H}_2$ and $\mathcal{H}_\infty$ norms to ensure stability and robustness, with proven convergence guarantees.
Contribution
It introduces both model-based and model-free RL algorithms for continuous-time stochastic control, incorporating robustness metrics and stability guarantees in a complex dynamical setting.
Findings
Proposed algorithms achieve stability and robustness in continuous-time stochastic control.
Rigorous convergence guarantees for model-based policy optimization.
Effective handling of unknown nonlinear dynamics in a stochastic setting.
Abstract
Following the recent resurgence in establishing linear control theoretic benchmarks for reinforcement learning (RL)-based policy optimization (PO) for complex dynamical systems with continuous state and action spaces, an optimal control problem for a continuous-time infinite-dimensional linear stochastic system possessing additive Brownian motion is optimized on a cost that is an exponent of the quadratic form of the state, input, and disturbance terms. We lay out a model-based and model-free algorithm for RL-based stochastic PO. For the model-based algorithm, we establish rigorous convergence guarantees. For the sampling-based algorithm, over trajectory arcs that emanate from the phase space, we find that the Hamilton-Jacobi-Bellman equation parameterizes trajectory costs -- resulting in a discrete-time (input and state-based) sampling scheme accompanied by unknown nonlinear dynamics…
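To make the model-based idea concrete, here is a minimal scalar sketch of a nested policy iteration for a mixed $\mathcal{H}_2/\mathcal{H}_\infty$ problem. It is an illustration under stated assumptions, not the authors' algorithm. It assumes a scalar system $\dot{x} = ax + bu + dw$ with cost weights $q$, $r$ and disturbance attenuation level $\gamma$, drops the Brownian noise term, and assumes $b^2/r > d^2/\gamma^2$. The inner loop evaluates the worst-case disturbance gain for a fixed controller, and the outer loop improves the controller. All names (`inner_loop`, `dual_loop_scalar`, `k0`) are hypothetical.

```python
def inner_loop(a, b, d, q, r, gamma, k, iters=50):
    """For a fixed control gain k (u = -k x), iterate the disturbance gain l (w = l x).

    Each pass solves the scalar Lyapunov equation
        2 (a - b k + d l) p + q + r k^2 - gamma^2 l^2 = 0
    for p, then updates l to the maximizing gain d p / gamma^2.
    """
    l = 0.0
    p = 0.0
    for _ in range(iters):
        a_cl = a - b * k + d * l  # closed-loop drift under (k, l)
        if a_cl >= 0.0:
            raise ValueError("closed loop is not stable for this (k, l)")
        p = (q + r * k * k - gamma ** 2 * l * l) / (-2.0 * a_cl)
        l = d * p / gamma ** 2
    return p


def dual_loop_scalar(a, b, d, q, r, gamma, k0, outer_iters=50):
    """Outer loop: improve the control gain using the inner-loop cost p."""
    k = k0  # assumption: k0 is stabilizing, i.e. a - b * k0 < 0
    p = 0.0
    for _ in range(outer_iters):
        p = inner_loop(a, b, d, q, r, gamma, k)
        k = b * p / r  # minimizing control gain for the current p
    return p, k


# Example with made-up parameters: an unstable open-loop system (a > 0).
a, b, d, q, r, gamma = 1.0, 1.0, 0.5, 1.0, 1.0, 2.0
p_star, k_star = dual_loop_scalar(a, b, d, q, r, gamma, k0=2.0)

# Residual of the scalar game Riccati equation
#   2 a p + q - p^2 (b^2 / r - d^2 / gamma^2) = 0
residual = 2 * a * p_star + q - p_star ** 2 * (b * b / r - d * d / gamma ** 2)
print(p_star, k_star, residual)
```

In this scalar setting each inner pass coincides with a Newton step on the fixed-controller Riccati equation, so a few iterations suffice. The matrix case would replace the divisions with Lyapunov-equation solves.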
Taxonomy
Topics: Stochastic processes and financial applications · Risk and Portfolio Optimization · Reinforcement Learning in Robotics
