Deep BSDE-ML Learning and Its Application to Model-Free Optimal Control
Yutian Wang, Yuan-Hua Ni

TL;DR
This paper introduces the Deep BSDE-ML method for solving the linear decoupled FBSDEs that arise in policy evaluation for stochastic control. It extends the Deep BSDE framework with a measurability loss to improve the approximation of control policies.
Contribution
The paper proposes a modified Deep BSDE method with a new measurability loss, which enables better approximation of the gradients of PDE solutions and facilitates model-free learning of optimal controls.
Findings
The measurability loss equals the expected mean squared error between the true diffusion term of the BSDE and its approximation.
Deep BSDE-ML extends the Deep BSDE method to approximating the gradients of PDE solutions.
The framework successfully learns robust feedback controllers with exploration noise.
Abstract
A modified Deep BSDE (backward stochastic differential equation) learning method with a measurability loss, called the Deep BSDE-ML method, is introduced in this paper to solve a class of linear decoupled forward-backward stochastic differential equations (FBSDEs) that arise in the policy evaluation step of learning optimal feedback policies for a class of stochastic control problems. The measurability loss is characterized via the measurability of the BSDE's state at the forward initial time, whereas the loss of the known Deep BSDE method is defined through the terminal state. Though the minima of the two loss functions are shown to be equal, the measurability loss is proved to be equal to the expected mean squared error between the true diffusion term of the BSDE and its approximation. This crucial observation extends the application of the Deep BSDE method -- approximating the gradients of the solution…
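The link between measurability at the initial time and the error in the diffusion term can be checked numerically on a toy problem. The sketch below is an illustration only, not the paper's algorithm. It assumes a one-dimensional BSDE with zero driver and terminal value W_T^2, whose exact solution is Y_t = W_t^2 + (T - t) with diffusion term Z_t = 2 W_t. It also assumes a hypothetical one-parameter guess a * W_t for the diffusion term. Rolling the equation back from the terminal value with that guess gives a candidate initial value that differs from the true Y_0 by the stochastic integral of (Z - guess). By the Itô isometry, its variance across paths is therefore the expected integrated squared error of the guess. Using the sample variance as the loss is a simplified reading of the abstract, and the function names are made up for this example.

```python
import numpy as np

def measurability_loss(a, n_paths=20000, n_steps=100, T=1.0, seed=0):
    """Sample variance of the rolled-back initial value for the guess a * W_t.

    Toy setting (an assumption of this sketch): driver f = 0, terminal value W_T^2.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Brownian increments and left-endpoint path values W_{t_i}, with W_0 = 0.
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    W_end = np.cumsum(dW, axis=1)
    W_left = np.hstack([np.zeros((n_paths, 1)), W_end[:, :-1]])
    W_T = W_end[:, -1]
    # Roll the BSDE back from the terminal value:
    # y0_hat = W_T^2 - sum_i (a * W_{t_i}) * dW_i.
    y0_hat = W_T**2 - np.sum(a * W_left * dW, axis=1)
    # The true Y_0 is known at time 0, so it is the same on every path;
    # the spread of y0_hat across paths is the penalty.
    return float(np.var(y0_hat))

def diffusion_mse(a, T=1.0):
    """Closed form of E int_0^T (2 W_t - a W_t)^2 dt = (2 - a)^2 T^2 / 2."""
    return (2.0 - a) ** 2 * T**2 / 2.0

for a in (0.0, 1.0, 2.0):
    print(f"a={a}: loss={measurability_loss(a):.3f}, "
          f"diffusion MSE={diffusion_mse(a):.3f}")
```

The two printed columns should roughly agree, with a small gap coming from time discretization and Monte Carlo error, and the loss should be smallest at a = 2, where the guess matches the true diffusion term.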
Taxonomy
Topics: Model Reduction and Neural Networks · Energy, Environment, and Transportation Policies · Gaussian Processes and Bayesian Inference
