Monte Carlo Bayesian Reinforcement Learning

Yi Wang (NUS); Kok Sung Won (NUS); David Hsu (NUS); Wee Sun Lee (NUS)

arXiv:1206.6449 · cs.LG · July 3, 2012 · ICML · 22 citations

TL;DR

Monte Carlo Bayesian Reinforcement Learning (MC-BRL) is a sampling-based approach: it draws a finite set of model-parameter hypotheses a priori and constructs a discrete POMDP from them. The resulting POMDP handles model uncertainty in both fully and partially observable environments and comes with proven performance guarantees.

Contribution

MC-BRL provides a simple, general method for Bayesian reinforcement learning that does not require conjugate priors and yields a discrete POMDP that existing point-based approximation algorithms can solve relatively easily.

Findings

01

The discrete POMDP approximates the BRL task well, with performance guarantees.

02

Handles both fully and partially observable worlds effectively.

03

Does not require conjugate distributions for belief representation.

Abstract

Bayesian reinforcement learning (BRL) encodes prior knowledge of the world in a model and represents uncertainty in model parameters by maintaining a probability distribution over them. This paper presents Monte Carlo BRL (MC-BRL), a simple and general approach to BRL. MC-BRL samples a priori a finite set of hypotheses for the model parameter values and forms a discrete partially observable Markov decision process (POMDP) whose state space is a cross product of the state space for the reinforcement learning task and the sampled model parameter space. The POMDP does not require conjugate distributions for belief representation, as earlier works do, and can be solved relatively easily with point-based approximation algorithms. MC-BRL naturally handles both fully and partially observable worlds. Theoretical and experimental results show that the discrete POMDP approximates the underlying…
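
The construction described in the abstract can be illustrated with a minimal sketch for the fully observable case. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy chain task, the slip-probability parameter `theta`, the mixture-of-uniforms prior, and all function names. The sketch samples K hypotheses a priori, builds transitions over the cross-product states `(s, k)` in which the hypothesis index `k` never changes, and updates a discrete belief over hypotheses by Bayes' rule, which needs no conjugate prior.

```python
import random


def sample_hypotheses(num_samples, rng):
    """Draw slip probabilities from a hypothetical, non-conjugate prior."""
    samples = []
    for _ in range(num_samples):
        # Assumed prior: an equal mixture of two uniform components.
        if rng.random() < 0.5:
            samples.append(rng.uniform(0.05, 0.25))
        else:
            samples.append(rng.uniform(0.6, 0.9))
    return samples


def transition_prob(theta, s, a, s_next, n_states):
    """Toy chain: the intended move succeeds with prob. 1 - theta, else the agent stays."""
    target = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    p = 0.0
    if s_next == target:
        p += 1.0 - theta
    if s_next == s:
        p += theta
    return p


def build_pomdp_transitions(thetas, n_states, actions):
    """Transitions over cross-product states (s, k); the hypothesis index k is static."""
    T = {}
    for k, theta in enumerate(thetas):
        for s in range(n_states):
            for a in actions:
                row = {}
                for s_next in range(n_states):
                    p = transition_prob(theta, s, a, s_next, n_states)
                    if p > 0.0:
                        row[(s_next, k)] = p
                T[((s, k), a)] = row
    return T


def update_belief(belief, thetas, s, a, s_next, n_states):
    """Bayes update over the sampled hypotheses after observing s -> s_next."""
    weights = [
        b * transition_prob(theta, s, a, s_next, n_states)
        for b, theta in zip(belief, thetas)
    ]
    total = sum(weights)
    if total == 0.0:
        # The observed transition is impossible under every hypothesis.
        return list(belief)
    return [w / total for w in weights]


# Illustrative usage with assumed sizes: 5 hypotheses, 4 states, 2 actions.
rng = random.Random(0)
n_states, actions = 4, (0, 1)
thetas = sample_hypotheses(5, rng)
T = build_pomdp_transitions(thetas, n_states, actions)
belief = [1.0 / len(thetas)] * len(thetas)
# A successful forward move shifts belief toward hypotheses with small theta.
posterior = update_belief(belief, thetas, s=1, a=1, s_next=2, n_states=n_states)
```

In this sketch the world state `s` is observed directly and only `k` is hidden, so the belief reduces to a weight vector over the sampled hypotheses. Planning over the resulting model, for example with a point-based POMDP solver as the abstract suggests, is not shown.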

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Data Stream Mining Techniques