A Policy Gradient Framework for Stochastic Optimal Control Problems with Global Convergence Guarantee
Mo Zhou, Jianfeng Lu

TL;DR
This paper introduces a policy gradient framework for stochastic optimal control in continuous time. It proves global convergence of the associated gradient flow and establishes a convergence rate under regularity assumptions.
Contribution
It provides a rigorous analysis of the global convergence of policy gradient methods in continuous-time stochastic control, built on the new notion of a local optimal control function, which characterizes the local optimality of each iterate.
Findings
Proves global convergence of the gradient flow in continuous-time stochastic control.
Establishes convergence rates under regularity assumptions.
Introduces the notion of a local optimal control function to characterize the local optimality of the iterate.
Abstract
We consider policy gradient methods for stochastic optimal control problems in continuous time. In particular, we analyze the gradient flow for the control, viewed as a continuous-time limit of the policy gradient method. We prove the global convergence of the gradient flow and establish a convergence rate under some regularity assumptions. The main novelty in the analysis is the notion of a local optimal control function, which is introduced to characterize the local optimality of the iterate.
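To make the relation between a policy gradient iteration and its gradient flow concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption chosen for illustration and is not taken from the paper: a scalar linear-quadratic problem with dynamics dX = (aX + u) dt + sigma dW, a linear feedback u = -kX parameterized by a single gain k, and a long-run average cost that has the closed form J(k) = (q + r k^2) sigma^2 / (2 (k - a)) for stabilizing gains k > a. The paper works with a flow on the control function itself; this sketch only shows that an explicit Euler step of the scalar flow dk/dtau = -J'(k) is an ordinary gradient update on the policy parameter.

```python
import math

# Hypothetical toy problem (an assumption, not the paper's setting):
#   dX_t = (A * X_t + u_t) dt + SIGMA dW_t,  feedback control u = -k * X.
# For a stabilizing gain k > A the closed loop is an Ornstein-Uhlenbeck
# process with stationary variance SIGMA^2 / (2 (k - A)), so the long-run
# average of Q * X^2 + R * u^2 is available in closed form.
A, SIGMA, Q, R = 1.0, 1.0, 1.0, 1.0


def cost(k):
    """Long-run average cost J(k) for a stabilizing gain k > A."""
    return (Q + R * k * k) * SIGMA**2 / (2.0 * (k - A))


def cost_grad(k):
    """Exact derivative J'(k), obtained with the quotient rule."""
    c = k - A
    return SIGMA**2 * (2.0 * R * k * c - (Q + R * k * k)) / (2.0 * c * c)


def policy_gradient(k0, step=0.1, n_steps=2000):
    """Explicit Euler discretization of the flow dk/dtau = -J'(k)."""
    k = k0
    for _ in range(n_steps):
        k -= step * cost_grad(k)  # one gradient update on the gain
    return k


# Minimizer of J on k > A, from setting J'(k) = 0 (the scalar Riccati root).
k_star = A + math.sqrt(A * A + Q / R)
k_final = policy_gradient(4.0)
print(k_final, k_star)
```

With these parameter values the iterates decrease from the initial gain 4.0 toward the minimizer. In this toy case convergence follows from elementary calculus on a one-dimensional cost, so it should not be read as an instance of the paper's theorem, only as a picture of the discrete update that the gradient flow idealizes.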
Taxonomy
Topics: Stochastic processes and financial applications · Risk and Portfolio Optimization · Optimization and Variational Analysis
