A Bayesian Sampling Approach to Exploration in Reinforcement Learning

John Asmuth; Lihong Li; Michael L. Littman; Ali Nouri; David Wingate

arXiv:1205.2664·cs.LG·May 14, 2012·154 cites

TL;DR

This paper introduces BOSS (Best of Sampled Set), a Bayesian sampling method for reinforcement learning that drives exploration by sampling multiple models from the posterior and acting optimistically with respect to them. It achieves near-optimal reward with high probability, with a sample complexity that is low relative to how quickly the posterior converges.

Contribution

The paper proposes a modular Bayesian approach with a resampling rule, extending prior work and demonstrating improved performance and flexibility in reinforcement learning tasks.

Findings

01 Achieves near-optimal reward with high probability.

02 Has a sample complexity that is low relative to the speed at which the posterior distribution converges.

03 Performs favorably compared to state-of-the-art reinforcement-learning methods.

Abstract

We present a modular approach to reinforcement learning that uses a Bayesian representation of the uncertainty over models. The approach, BOSS (Best of Sampled Set), drives exploration by sampling multiple models from the posterior and selecting actions optimistically. It extends previous work by providing a rule for deciding when to resample and how to combine the models. We show that our algorithm achieves near-optimal reward with high probability with a sample complexity that is low relative to the speed at which the posterior distribution converges during learning. We demonstrate that BOSS performs quite favorably compared to state-of-the-art reinforcement-learning approaches and illustrate its flexibility by pairing it with a non-parametric model that generalizes across states.
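The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a small tabular MDP, known rewards, an independent Dirichlet posterior over each transition distribution, a "combine" step that lets the agent choose the best (sampled model, action) pair in every state, and a resampling rule that fires when a state-action pair has been tried `B` times. The names `sample_models`, `solve_merged`, `run_boss`, `k`, and `B` are hypothetical, chosen for illustration only.

```python
import numpy as np

def sample_models(counts, prior, k, rng):
    """Draw k transition models from a Dirichlet posterior (assumed model class).

    counts has shape (S, A, S); the result has shape (k, S, A, S).
    Normalized Gamma draws are Dirichlet distributed.
    """
    alpha = counts + prior
    g = rng.gamma(alpha, size=(k,) + alpha.shape)
    return g / g.sum(axis=-1, keepdims=True)

def solve_merged(models, rewards, gamma=0.95, iters=500):
    """Value iteration where each state may use any (sampled model, action) pair.

    Taking the max over sampled models is what makes the values optimistic.
    rewards has shape (S, A) and is assumed known.
    """
    k, S, A, _ = models.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = rewards[None, :, :] + gamma * (models @ V)  # shape (k, S, A)
        V = Q.max(axis=(0, 2))
    policy = Q.max(axis=0).argmax(axis=1)  # real action to take in each state
    return V, policy

def run_boss(true_P, rewards, k=5, B=3, steps=200, seed=0):
    """Act greedily in the merged model; resample when a pair reaches B visits."""
    rng = np.random.default_rng(seed)
    S, A, _ = true_P.shape
    counts = np.zeros((S, A, S))
    _, policy = solve_merged(sample_models(counts, 1.0, k, rng), rewards)
    s, total = 0, 0.0
    for _ in range(steps):
        a = policy[s]
        s_next = rng.choice(S, p=true_P[s, a])
        counts[s, a, s_next] += 1
        total += rewards[s, a]
        if counts[s, a].sum() == B:  # assumed resampling rule
            _, policy = solve_merged(sample_models(counts, 1.0, k, rng), rewards)
        s = s_next
    return total, counts

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    S, A = 5, 2
    true_P = rng.dirichlet(np.ones(S), size=(S, A))  # random toy environment
    rewards = rng.random((S, A))
    total, counts = run_boss(true_P, rewards)
    print("total reward:", total)
    print("visits per state-action pair:")
    print(counts.sum(axis=-1))
```

In this sketch, the value of the merged model is at least the value of every individual sampled model, so a larger `k` makes the agent more optimistic about poorly explored state-action pairs, while `B` controls how often the posterior is consulted again.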

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Data Stream Mining Techniques