TL;DR
This paper introduces a model-based reinforcement learning algorithm that uses Bayesian neural networks trained with alpha-divergences to model complex stochastic dynamics, enabling policy learning on a benchmark where model-based approaches usually fail and in a real-world gas turbine control task.
Contribution
The paper proposes a novel combination of Bayesian neural networks trained with alpha-divergences and stochastic optimization for policy search in stochastic dynamical systems.
Findings
Solves a challenging benchmark on which model-based approaches usually fail.
Achieves promising results in controlling a real-world gas turbine.
Captures complex statistical patterns in the transition dynamics, such as multi-modality and heteroskedasticity.
Abstract
We present an algorithm for model-based reinforcement learning that combines Bayesian neural networks (BNNs) with random roll-outs and stochastic optimization for policy learning. The BNNs are trained by minimizing α-divergences, allowing us to capture complicated statistical patterns in the transition dynamics, e.g. multi-modality and heteroskedasticity, which are usually missed by other common modeling approaches. We illustrate the performance of our method by solving a challenging benchmark where model-based approaches usually fail and by obtaining promising results in a real-world scenario for controlling a gas turbine.
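The roll-out idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the BNN is replaced here by a hand-written toy transition function, and the names `toy_dynamics`, `policy`, and `rollout_cost`, the linear policy, the quadratic cost, and all constants are assumptions made for illustration only. The toy dynamics are deliberately bimodal (multi-modality) with noise that grows with the state magnitude (heteroskedasticity), and the expected cost of a policy is estimated by averaging over many random roll-outs.

```python
import numpy as np


def toy_dynamics(state, action, rng):
    """Hypothetical stochastic transition (a stand-in for a learned BNN)."""
    # Multi-modality: each roll-out is pushed left or right with equal probability.
    mode = rng.choice([-1.0, 1.0], size=state.shape)
    # Heteroskedasticity: the noise scale grows with the state magnitude.
    noise = rng.normal(0.0, 0.05 + 0.1 * np.abs(state))
    return 0.9 * state + action + 0.2 * mode + noise


def policy(state, w):
    """Hypothetical linear policy with a single gain parameter w."""
    return -w * state


def rollout_cost(w, n_rollouts=200, horizon=20, seed=0):
    """Monte Carlo estimate of the expected cumulative quadratic cost."""
    rng = np.random.default_rng(seed)
    state = np.ones(n_rollouts)   # every roll-out starts at state 1.0
    total = np.zeros(n_rollouts)  # accumulated cost per roll-out
    for _ in range(horizon):
        action = policy(state, w)
        state = toy_dynamics(state, action, rng)
        total += state ** 2       # assumed cost: squared distance from 0
    return total.mean()


# Crude stand-in for policy search: compare a few candidate gains.
# Reusing the same seed gives every candidate the same random draws.
candidates = [0.0, 0.3, 0.6, 0.9]
best_w = min(candidates, key=rollout_cost)
```

Comparing a handful of candidate gains is only a placeholder for the stochastic optimization the abstract refers to; the sketch shows how a roll-out average turns a stochastic model into a scalar objective for the policy parameters.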
Videos
Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks (YouTube)
