BADDr: Bayes-Adaptive Deep Dropout RL for POMDPs

Sammie Katt; Hai Nguyen; Frans A. Oliehoek; Christopher Amato

arXiv:2202.08884 · cs.LG · February 21, 2022 · 1 citation

TL;DR

This paper introduces BADDr, a scalable Bayesian RL method for POMDPs using dropout networks, unifying previous models and demonstrating competitive performance on small and large domains.

Contribution

It presents a representation-agnostic Bayesian RL framework and a novel dropout-based approach that improves scalability in partially observable environments.

Findings

01 Competitive with state-of-the-art BRL on small domains

02 Able to solve larger POMDPs effectively

03 Belief inference is more scalable with dropout networks

Abstract

While reinforcement learning (RL) has made great advances in scalability, exploration and partial observability are still active research topics. In contrast, Bayesian RL (BRL) provides a principled answer to both state estimation and the exploration-exploitation trade-off, but struggles to scale. To tackle this challenge, BRL frameworks with various prior assumptions have been proposed, with varied success. This work presents a representation-agnostic formulation of BRL under partial observability, unifying the previous models under one theoretical umbrella. To demonstrate its practical significance we also propose a novel derivation, Bayes-Adaptive Deep Dropout RL (BADDr), based on dropout networks. Under this parameterization, in contrast to previous work, the belief over the state and dynamics is a more scalable inference problem. We choose actions through Monte-Carlo tree search…
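
The idea of a joint belief over the state and the dynamics, with the dynamics represented by a dropout network, can be illustrated with a small sketch. This is not the paper's implementation: it assumes one common reading of dropout as approximate Bayesian inference, where each sampled dropout mask selects one hypothesis about the dynamics, and it pairs each mask with a state to form a particle. All names (`sample_mask`, `next_state_probs`, `init_belief`, `step_belief`), the network sizes, and the random weights standing in for a trained network are hypothetical. Only the prediction step is shown; weighting particles by observations is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes, chosen only for illustration.
N_STATES, N_ACTIONS, HIDDEN = 4, 2, 16

# Random weights stand in for a trained dynamics network (assumption).
W1 = rng.normal(size=(N_STATES + N_ACTIONS, HIDDEN))
W2 = rng.normal(size=(HIDDEN, N_STATES))


def one_hot(index, size):
    vec = np.zeros(size)
    vec[index] = 1.0
    return vec


def sample_mask(p_drop=0.5):
    """Draw one dropout mask, i.e. one sampled hypothesis about the dynamics."""
    return (rng.random(HIDDEN) >= p_drop).astype(float)


def next_state_probs(state, action, mask, p_drop=0.5):
    """Next-state distribution under the dynamics selected by `mask`."""
    x = np.concatenate([one_hot(state, N_STATES), one_hot(action, N_ACTIONS)])
    hidden = np.maximum(x @ W1, 0.0) * mask / (1.0 - p_drop)  # inverted dropout
    logits = hidden @ W2
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


def init_belief(n_particles=100):
    """A belief is a list of (state, mask) particles."""
    return [(int(rng.integers(N_STATES)), sample_mask()) for _ in range(n_particles)]


def step_belief(belief, action):
    """Prediction step: move each particle's state under its own dynamics."""
    moved = []
    for state, mask in belief:
        probs = next_state_probs(state, action, mask)
        moved.append((int(rng.choice(N_STATES, p=probs)), mask))
    return moved
```

Calling `step_belief(init_belief(), action=0)` returns a new particle set in which every particle keeps its mask and receives a next state sampled from its own model. In this sketch, a planner such as Monte-Carlo tree search would draw particles from the belief to seed its simulations.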

Taxonomy

Topics: Data Stream Mining Techniques · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications

Methods: Monte-Carlo Tree Search · Dropout