Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks

Stefan Depeweg; José Miguel Hernández-Lobato; Finale Doshi-Velez; Steffen Udluft

arXiv:1605.07127 · stat.ML · March 9, 2017

TL;DR

This paper introduces a model-based reinforcement learning algorithm that uses Bayesian neural networks trained with alpha-divergences to effectively model complex stochastic dynamics, enabling successful policy learning in challenging scenarios.

Contribution

The paper proposes a novel combination of Bayesian neural networks trained with alpha-divergences and stochastic optimization for policy search in stochastic dynamical systems.
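For readers unfamiliar with the term, the following is a reference sketch of the $\alpha$-divergence in Amari's form, which is the form commonly used when training approximate posteriors this way. This page does not state the definition, so treat the exact parameterization as an assumption; $p$ denotes the target distribution and $q$ the approximation.

$$
D_{\alpha}[p \,\|\, q] = \frac{1}{\alpha(1-\alpha)} \left( 1 - \int p(x)^{\alpha}\, q(x)^{1-\alpha}\, dx \right)
$$

In the limit $\alpha \to 1$ this recovers $\mathrm{KL}[p \,\|\, q]$, and in the limit $\alpha \to 0$ it recovers $\mathrm{KL}[q \,\|\, p]$, the objective minimized by standard variational Bayes. Intermediate values of $\alpha$ interpolate between the two behaviours.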

Findings

01 Solves a challenging benchmark on which model-based approaches usually fail.

02 Obtains promising results in a real-world scenario: controlling a gas turbine.

03 Captures complicated statistical patterns in the transition dynamics, such as multi-modality and heteroskedasticity.

Abstract

We present an algorithm for model-based reinforcement learning that combines Bayesian neural networks (BNNs) with random roll-outs and stochastic optimization for policy learning. The BNNs are trained by minimizing $\alpha$-divergences, allowing us to capture complicated statistical patterns in the transition dynamics, e.g., multi-modality and heteroskedasticity, which are usually missed by other common modeling approaches. We illustrate the performance of our method by solving a challenging benchmark where model-based approaches usually fail and by obtaining promising results in a real-world scenario for controlling a gas turbine.
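The random roll-out idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy dynamics, the `sample_model` stand-in for drawing BNN weights from an approximate posterior, the `tanh` policy, the quadratic cost, and all parameter values are hypothetical and chosen only to show how a policy's expected cost might be estimated by Monte Carlo roll-outs through a stochastic model.

```python
import numpy as np

def sample_model(rng):
    # Hypothetical stand-in for sampling network weights from an
    # approximate posterior; here just two scalar parameters.
    return {"a": rng.normal(0.9, 0.05), "b": rng.normal(0.5, 0.05)}

def step(state, action, weights, rng):
    # Toy stochastic transition (an assumption, not the paper's model).
    # The noise is bimodal (sign chosen at random) and heteroskedastic
    # (its magnitude grows with |state|).
    mode = 1.0 if rng.normal() > 0 else -1.0
    noise = 0.1 * (1.0 + abs(state)) * mode
    return weights["a"] * state + weights["b"] * action + noise

def policy(state, theta):
    # Hypothetical one-parameter policy with bounded actions.
    return np.tanh(theta * state)

def expected_cost(theta, n_rollouts=200, horizon=20, seed=0):
    # Monte Carlo estimate of the expected cumulative cost: each
    # roll-out draws its own model sample and its own transition noise.
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_rollouts):
        weights = sample_model(rng)
        state = 1.0
        for _ in range(horizon):
            action = policy(state, theta)
            state = step(state, action, weights, rng)
            total += state ** 2  # quadratic cost on the state
    return total / n_rollouts

if __name__ == "__main__":
    for theta in (-1.0, 0.0, 1.0):
        print(theta, expected_cost(theta))
```

A policy search would then adjust `theta` to reduce this estimate. The abstract names stochastic optimization for that step; the sketch stops at cost estimation, and fixing the seed makes the estimate reproducible so that candidate policies are compared on the same random draws.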

Code & Models

Videos

Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks · youtube