# Learning Deep Stochastic Optimal Control Policies using Forward-Backward SDEs

**Authors:** Marcus Pereira, Ziyi Wang, Ioannis Exarchos, and Evangelos A. Theodorou

arXiv: 1902.03986 · 2021-07-12

## TL;DR

This paper introduces a scalable deep learning framework for stochastic optimal control based on forward-backward SDEs, applicable to complex robotics systems under uncertainty.

## Contribution

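The paper presents deep neural network architectures, built from recurrent and fully connected layers, that use the relation between nonlinear partial differential equations and forward-backward SDEs to solve stochastic optimal control problems. The framework targets general classes of stochastic systems and decision-making problems in robotics and autonomy.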

## Key findings

- Evaluated in simulation on three non-linear systems
- Tested both with and without control constraints
- Designed to scale to general classes of stochastic systems

## Abstract

In this paper we propose a new methodology for decision-making under uncertainty using recent advancements in the areas of nonlinear stochastic optimal control theory, applied mathematics, and machine learning. Grounded on the fundamental relation between certain nonlinear partial differential equations and forward-backward stochastic differential equations, we develop a control framework that is scalable and applicable to general classes of stochastic systems and decision-making problem formulations in robotics and autonomy. The proposed deep neural network architectures for stochastic control consist of recurrent and fully connected layers. The performance and scalability of the aforementioned algorithm are investigated in three non-linear systems in simulation with and without control constraints. We conclude with a discussion on future directions and their implications to robotics.
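The PDE-FBSDE relation mentioned in the abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a scalar forward SDE `dX = sigma dW` with terminal cost `phi(x) = x^2` and a zero driver, for which the value function is known in closed form, `V(t, x) = x^2 + sigma^2 (T - t)`. The backward process `Y_t = V(t, X_t)` is propagated forward in time with `Z = sigma * dV/dx`, and the mismatch between `Y_T` and `phi(X_T)` is measured. In a learning-based solver the initial value and `Z` would come from a network; here the analytic expressions stand in for it. The function name `simulate_fbsde` and all parameters are hypothetical.

```python
import numpy as np


def simulate_fbsde(x0=1.0, sigma=1.0, T=1.0, n_steps=1000, n_paths=2000,
                   y0=None, seed=0):
    """Euler-Maruyama rollout of a toy scalar FBSDE (illustrative assumption).

    Forward SDE:   dX = sigma dW,   X(0) = x0
    Backward SDE:  dY = Z dW,       Y(T) = X(T)^2   (driver h = 0)
    Returns the mean squared terminal mismatch E[(Y_T - X_T^2)^2].
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, float(x0))
    if y0 is None:
        # Closed-form V(0, x0); a learned solver would treat this as trainable.
        y0 = x0 ** 2 + sigma ** 2 * T
    y = np.full(n_paths, float(y0))
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        z = sigma * 2.0 * x   # Z = sigma * dV/dx, stand-in for a network output
        y = y + z * dw        # backward SDE propagated forward in time
        x = x + sigma * dw    # forward SDE step (zero drift)
    return float(np.mean((y - x ** 2) ** 2))


if __name__ == "__main__":
    print("loss with correct Y0:", simulate_fbsde())
    print("loss with wrong Y0:  ", simulate_fbsde(y0=1.0))
```

With the correct initial value the terminal mismatch is only discretization error, while a wrong initial value leaves a clearly larger mismatch. A quantity of this kind is what a learning-based FBSDE solver can minimize over its parameters.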

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.03986/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1902.03986/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1902.03986/full.md

---
Source: https://tomesphere.com/paper/1902.03986