Bayes-Adaptive Deep Model-Based Policy Optimisation

Tai Hoang; Ngo Anh Vien

arXiv:2010.15948 · cs.RO · January 6, 2021

TL;DR

This paper presents RoMBRL, a Bayesian deep model-based reinforcement learning method that captures model uncertainty to make policy optimisation more sample-efficient, and reports better results than existing approaches on many control benchmark tasks.

Contribution

Introduces RoMBRL, a Bayesian deep model-based RL approach that formulates policy optimisation as a Bayes-adaptive Markov decision process, maintains model uncertainty as belief distributions, and trains history-based policies to improve sample efficiency.

Findings

01. RoMBRL outperforms existing methods on control benchmarks.

02. The method achieves higher sample efficiency and task performance.

03. Uncertainty propagation improves policy optimisation.

Abstract

We introduce a Bayesian (deep) model-based reinforcement learning method (RoMBRL) that can capture model uncertainty to achieve sample-efficient policy optimisation. We propose to formulate the model-based policy optimisation problem as a Bayes-adaptive Markov decision process (BAMDP). RoMBRL maintains model uncertainty via belief distributions through a deep Bayesian neural network whose samples are generated via stochastic gradient Hamiltonian Monte Carlo. Uncertainty is propagated through simulations controlled by sampled models and history-based policies. As beliefs are encoded in visited histories, we propose a history-based policy network that can be end-to-end trained to generalise across history space and will be trained using recurrent Trust-Region Policy Optimisation. We show that RoMBRL outperforms existing approaches on many challenging control benchmark tasks in terms of…
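
To make the idea of propagating uncertainty through simulations controlled by sampled models and history-based policies more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the linear dynamics, the quadratic reward, and every name in it (sample_models, history_policy, rollout, estimate_return) are assumptions chosen for illustration. In particular, random perturbations stand in for the Bayesian neural network samples that the paper draws with stochastic gradient Hamiltonian Monte Carlo, and a hand-written rule stands in for the learned history-based policy network.

```python
import numpy as np


def sample_models(rng, n_models, state_dim=2, action_dim=1, spread=0.05):
    """Hypothetical stand-in for posterior samples of the dynamics model.

    Each "model" is a randomly perturbed linear system s' = A s + B a.
    """
    models = []
    for _ in range(n_models):
        A = np.eye(state_dim) + spread * rng.standard_normal((state_dim, state_dim))
        B = 0.1 + spread * rng.standard_normal((state_dim, action_dim))
        models.append((A, B))
    return models


def history_policy(history, action_dim, gain=0.5):
    """Toy history-based policy: acts on the running mean of visited states."""
    mean_state = np.mean(history, axis=0)
    return -gain * mean_state[:action_dim]


def rollout(model, start_state, horizon):
    """Simulate one trajectory under a single sampled model; return its total reward."""
    A, B = model
    state = np.asarray(start_state, dtype=float)
    history = [state]
    total_reward = 0.0
    for _ in range(horizon):
        action = history_policy(history, action_dim=B.shape[1])
        state = A @ state + B @ action
        history.append(state)
        total_reward -= float(state @ state)  # assumed reward: -||s||^2
    return total_reward


def estimate_return(models, start_state, horizon, n_rollouts, rng):
    """Average return over rollouts, each driven by a randomly drawn model."""
    returns = np.array([
        rollout(models[rng.integers(len(models))], start_state, horizon)
        for _ in range(n_rollouts)
    ])
    return returns.mean(), returns.std()
```

Calling estimate_return(sample_models(rng, 5), [1.0, -1.0], horizon=10, n_rollouts=20, rng=rng) with rng = np.random.default_rng(0) returns a mean and a standard deviation of the simulated return. In this sketch the spread across rollouts comes entirely from disagreement between the sampled models, which is the sense in which model uncertainty reaches the return estimate.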

Code & Models

Repositories

thobotics/RoMBRL · tf · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Simulation Techniques and Applications