Global Convergence of Direct Policy Search for State-Feedback $\mathcal{H}_\infty$ Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential

Xingang Guo; Bin Hu

arXiv:2210.11577·math.OC·October 24, 2022

TL;DR

This paper proves that direct policy search methods, specifically Goldstein's subgradient method, can globally solve nonsmooth, nonconvex $\mathcal{H}_\infty$ robust control problems, establishing convergence guarantees and connecting optimization theory with control design.

Contribution

It demonstrates that all Clarke stationary points are global minima in nonsmooth $\mathcal{H}_\infty$ control, and proves the global convergence of Goldstein's method for this class of problems.

Findings

01

All Clarke stationary points are global minima.

02

Goldstein's subgradient method converges globally to the optimal solution.

03

The $\mathcal{H}_\infty$ control problem's sublevel sets are compact.
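
To make the second finding more concrete, here is a minimal sketch of a Goldstein-style subgradient step on a toy nonsmooth function. It is not the paper's algorithm: the Goldstein $\delta$-subdifferential is approximated by sampling gradients in a $\delta$-ball (a gradient-sampling heuristic), and the minimum-norm element of their convex hull is found approximately by Frank-Wolfe. The names `min_norm_in_hull`, `goldstein_sketch`, `f`, and `f_grad`, the toy objective, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def min_norm_in_hull(G, iters=200):
    """Approximate the minimum-norm point of conv{rows of G} with Frank-Wolfe."""
    m = G.shape[0]
    lam = np.full(m, 1.0 / m)            # convex weights on the sampled gradients
    for t in range(iters):
        g = lam @ G                      # current point in the hull
        i = int(np.argmin(G @ g))        # vertex minimizing the linearized objective
        step = 2.0 / (t + 2.0)
        lam *= 1.0 - step
        lam[i] += step
    return lam @ G

def goldstein_sketch(f_grad, x0, delta=0.1, n_samples=20, iters=200, seed=0):
    """Sampled approximation of Goldstein's method: x <- x - delta * g / ||g||."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        # Sample points inside the delta-ball around x.
        d = rng.normal(size=(n_samples, x.size))
        d /= np.linalg.norm(d, axis=1, keepdims=True)
        pts = x + delta * rng.uniform(size=(n_samples, 1)) * d
        # Gradients at the samples (plus the center) stand in for the
        # Goldstein delta-subdifferential, which is their convex hull here.
        G = np.array([f_grad(p) for p in pts] + [f_grad(x)])
        g = min_norm_in_hull(G)
        norm = np.linalg.norm(g)
        if norm < 1e-6:                  # approximately (delta, eps)-stationary
            delta *= 0.5                 # shrink the radius and keep going
            continue
        x = x - delta * g / norm         # normalized step of length delta
    return x

# Toy nonsmooth objective (assumed for illustration): f(x) = |x1| + 2|x2|.
def f(x):
    return abs(x[0]) + 2.0 * abs(x[1])

def f_grad(x):
    # A valid Clarke subgradient everywhere; np.sign(0) = 0 lies in [-1, 1].
    return np.array([np.sign(x[0]), 2.0 * np.sign(x[1])])

x_final = goldstein_sketch(f_grad, x0=[3.0, -2.0])
```

In the $\mathcal{H}_\infty$ setting the objective would be the closed-loop norm as a function of the feedback gain, and the iterates would additionally need to stay inside the set of stabilizing gains; this toy example ignores both aspects.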

Abstract

Direct policy search has been widely applied in modern reinforcement learning and continuous control. However, the theoretical properties of direct policy search on nonsmooth robust control synthesis have not been fully understood. The optimal $H_{\infty}$ control framework aims at designing a policy to minimize the closed-loop $H_{\infty}$ norm, and is arguably the most fundamental robust control paradigm. In this work, we show that direct policy search is guaranteed to find the global solution of the robust $H_{\infty}$ state-feedback control design problem. Notice that policy search for optimal $H_{\infty}$ control leads to a constrained nonconvex nonsmooth optimization problem, where the nonconvex feasible set consists of all the policies stabilizing the closed-loop dynamics. We show that for this nonsmooth optimization problem, all Clarke stationary…
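
As a rough illustration of the objective described in the abstract, the sketch below evaluates a closed-loop $H_{\infty}$ cost for a static state-feedback gain by gridding the unit circle. The discrete-time model $x_{t+1} = A x_t + B u_t + w_t$ with $u_t = -K x_t$, the weights $Q$ and $R$, and the function name `hinf_cost` are assumptions made for illustration; the abstract as shown here does not specify them. Frequency gridding only yields a lower bound on the true norm, so treat this as a sanity-check tool rather than an exact evaluation.

```python
import numpy as np

def hinf_cost(A, B, K, Q, R, n_freq=512):
    """Gridded estimate of the H-infinity norm from w to z = (Q^{1/2} x, R^{1/2} u).

    Returns np.inf when K is not stabilizing, which mirrors the constraint
    that policy search is restricted to stabilizing gains.
    """
    A_cl = A - B @ K
    if np.max(np.abs(np.linalg.eigvals(A_cl))) >= 1.0:
        return np.inf
    n = A.shape[0]
    # Symmetric square root C with C^T C = Q + K^T R K.
    evals, evecs = np.linalg.eigh(Q + K.T @ R @ K)
    C = (evecs * np.sqrt(np.clip(evals, 0.0, None))) @ evecs.T
    worst = 0.0
    for w in np.linspace(0.0, 2.0 * np.pi, n_freq, endpoint=False):
        # Frequency response of the closed loop at z = e^{jw}.
        G = C @ np.linalg.inv(np.exp(1j * w) * np.eye(n) - A_cl)
        worst = max(worst, np.linalg.norm(G, 2))   # largest singular value
    return worst

# Scalar example: a = b = q = r = 1, so the closed-loop pole is 1 - k.
A = np.array([[1.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
cost_deadbeat = hinf_cost(A, B, np.array([[1.0]]), Q, R)   # pole at 0
cost_unstable = hinf_cost(A, B, np.zeros((1, 1)), Q, R)    # pole at 1, not stabilizing
```

For the scalar case the cost has the closed form $\sqrt{q + r k^2} / (1 - |a - bk|)$, which gives $\sqrt{2}$ at $k = 1$ and makes the nonsmooth dependence on $k$ (through the absolute value) easy to see.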

Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics