On Efficient Bayesian Exploration in Model-Based Reinforcement Learning

Alberto Caron; Chris Hicks; Vasilios Mavroudis

arXiv:2507.02639·cs.LG·July 4, 2025

TL;DR

This paper analyzes existing information-theoretic approaches to data-efficient exploration in reinforcement learning. It proves guarantees for exploration bonuses that target epistemic uncertainty and discusses practical approximations for using them.

Contribution

It provides formal guarantees for epistemic uncertainty-based exploration bonuses and proposes a new framework, PTS-BE, combining planning with information gain for improved exploration.

Findings

01. PTS-BE outperforms baselines in sparse-reward environments

02. Theoretical guarantees for IG-based exploration bonuses are established

03. Practical approximations enable scalable implementation

Abstract

In this work, we address the challenge of data-efficient exploration in reinforcement learning by examining existing principled, information-theoretic approaches to intrinsic motivation. Specifically, we focus on a class of exploration bonuses that targets epistemic uncertainty rather than the aleatoric noise inherent in the environment. We prove that these bonuses naturally signal epistemic information gains and converge to zero once the agent becomes sufficiently certain about the environment's dynamics and rewards, thereby aligning exploration with genuine knowledge gaps. Our analysis provides formal guarantees for IG-based approaches, which previously lacked theoretical grounding. To enable practical use, we also discuss tractable approximations via sparse variational Gaussian Processes, Deep Kernels and Deep Ensemble models. We then outline a general framework - Predictive…
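
The distinction the abstract draws between epistemic and aleatoric uncertainty can be illustrated with a toy example. The sketch below is not the paper's method or its PTS-BE framework; it assumes a scalar conjugate Gaussian model (unknown mean with a Gaussian prior, known observation noise), and all function names are hypothetical. In this setting the information gain from one more observation has the closed form 0.5 * log(1 + epistemic variance / noise variance), so the bonus shrinks toward zero as data accumulates while the predictive variance stays bounded below by the irreducible noise.

```python
import math


def posterior_variance(prior_var: float, noise_var: float, n_obs: int) -> float:
    """Posterior variance of an unknown mean after n_obs noisy observations.

    Assumes a Gaussian prior on the mean and Gaussian noise of known variance.
    """
    return 1.0 / (1.0 / prior_var + n_obs / noise_var)


def info_gain_bonus(epistemic_var: float, noise_var: float) -> float:
    """Mutual information (in nats) between the unknown mean and the next observation."""
    return 0.5 * math.log1p(epistemic_var / noise_var)


def predictive_variance(epistemic_var: float, noise_var: float) -> float:
    """Total predictive variance: epistemic part plus aleatoric noise."""
    return epistemic_var + noise_var


if __name__ == "__main__":
    prior_var, noise_var = 1.0, 0.25  # illustrative values, not from the paper
    for n in (0, 1, 10, 100, 1000):
        ep_var = posterior_variance(prior_var, noise_var, n)
        bonus = info_gain_bonus(ep_var, noise_var)
        total = predictive_variance(ep_var, noise_var)
        print(f"n={n:5d}  bonus={bonus:.6f}  predictive_var={total:.6f}")
```

A bonus based on total predictive variance would keep rewarding visits to noisy states indefinitely, whereas the information-gain form above vanishes once the posterior has concentrated, which is the behavior the abstract describes.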
