Monte Carlo Bayesian Reinforcement Learning
Yi Wang (NUS), Kok Sung Won (NUS), David Hsu (NUS), Wee Sun Lee (NUS)

TL;DR
Monte Carlo Bayesian Reinforcement Learning (MC-BRL) is a sampling-based approach: it samples a finite set of hypotheses for the model parameters and builds a discrete POMDP from them, which captures model uncertainty in both fully and partially observable environments and comes with performance guarantees.
Contribution
MC-BRL provides a simple, general method for Bayesian reinforcement learning that does not require conjugate priors and yields a discrete POMDP that existing point-based approximation algorithms can solve relatively easily.
Findings
The discrete POMDP approximates the BRL problem well, with performance guarantees.
Handles both fully and partially observable worlds effectively.
Does not require conjugate distributions for belief representation.
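To illustrate why conjugacy is not needed, here is a minimal sketch of a belief update over a finite set of sampled hypotheses. It is not taken from the paper: the function name `update_belief`, the three `thetas` values, and the single-observation example are all hypothetical. The point is only that, once the hypotheses are a finite sample, the belief is a plain weight vector and Bayes' rule reduces to multiply-and-normalize, whatever the prior looked like.

```python
def update_belief(belief, likelihoods):
    """Bayes update of a discrete belief over K sampled hypotheses.

    belief[k]      : current probability of hypothesis k
    likelihoods[k] : P(observed outcome | hypothesis k)
    """
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("outcome has zero probability under every hypothesis")
    return [u / total for u in unnormalized]


# Hypothetical example: three sampled values of an unknown success probability.
thetas = [0.2, 0.5, 0.8]
belief = [1 / 3, 1 / 3, 1 / 3]  # equal weight on each sample

# Observe one success: the likelihood under hypothesis k is simply thetas[k].
belief = update_belief(belief, thetas)
```

After one observed success the weights shift toward the larger sampled values, and no closed-form posterior family is involved at any step.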
Abstract
Bayesian reinforcement learning (BRL) encodes prior knowledge of the world in a model and represents uncertainty in model parameters by maintaining a probability distribution over them. This paper presents Monte Carlo BRL (MC-BRL), a simple and general approach to BRL. MC-BRL samples a priori a finite set of hypotheses for the model parameter values and forms a discrete partially observable Markov decision process (POMDP) whose state space is a cross product of the state space for the reinforcement learning task and the sampled model parameter space. The POMDP does not require conjugate distributions for belief representation, as earlier works do, and can be solved relatively easily with point-based approximation algorithms. MC-BRL naturally handles both fully and partially observable worlds. Theoretical and experimental results show that the discrete POMDP approximates the underlying…
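The construction described in the abstract can be sketched roughly as follows for the fully observable case. This is an illustrative sketch under stated assumptions, not the authors' implementation: the names `sample_hypotheses`, `build_pomdp`, and `chain_transition`, the two-state toy task, and the uniform prior are all hypothetical, rewards are omitted, and the sketch assumes that the sampled parameter index stays fixed over time while only the task state is observed.

```python
import random


def sample_hypotheses(prior_sampler, num_samples, seed=0):
    """Draw a finite set of model-parameter hypotheses from the prior, a priori."""
    rng = random.Random(seed)
    return [prior_sampler(rng) for _ in range(num_samples)]


def build_pomdp(states, actions, hypotheses, transition_fn):
    """Build a discrete POMDP over the cross product (task state, hypothesis index).

    Assumption of this sketch: the hypothesis index k never changes, so a
    transition (s, k) -> (s', k) has probability transition_fn(theta_k, s, a, s').
    """
    pomdp_states = [(s, k) for k in range(len(hypotheses)) for s in states]
    transitions = {}
    for (s, k) in pomdp_states:
        for a in actions:
            for s_next in states:
                p = transition_fn(hypotheses[k], s, a, s_next)
                if p > 0.0:
                    transitions[((s, k), a, (s_next, k))] = p

    def observe(pomdp_state):
        # Fully observable task: the agent sees s, while k stays hidden.
        return pomdp_state[0]

    return pomdp_states, transitions, observe


def chain_transition(theta, s, a, s_next):
    """Hypothetical toy task: 'go' moves 0 -> 1 with unknown probability theta."""
    if s == 1:  # state 1 is absorbing
        return 1.0 if s_next == 1 else 0.0
    if a == "go":
        return theta if s_next == 1 else 1.0 - theta
    return 1.0 if s_next == 0 else 0.0  # 'stay' keeps the agent in state 0


states, actions = [0, 1], ["go", "stay"]
# Any prior that can be sampled works here; a uniform draw is used for brevity.
hypotheses = sample_hypotheses(lambda rng: rng.random(), num_samples=5)
pomdp_states, transitions, observe = build_pomdp(
    states, actions, hypotheses, chain_transition
)
```

The resulting model has `len(states) * len(hypotheses)` states, so the number of samples directly controls the size of the POMDP handed to a solver.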
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Data Stream Mining Techniques
