Value of Information-Enhanced Exploration in Bootstrapped DQN

Stergios Plataniotis; Charilaos Akasiadis; Georgios Chalkiadakis

arXiv:2511.02969·cs.LG·November 24, 2025

Open Access

TL;DR

This paper enhances deep exploration in reinforcement learning by integrating value of information into Bootstrapped DQN, leading to improved performance in complex, sparse-reward environments without extra hyperparameters.

Contribution

The paper introduces two novel algorithms that incorporate value of information estimates into Bootstrapped DQN to improve exploration efficiency.

Findings

01. Enhanced performance in Atari games with sparse rewards.

02. Better utilization of uncertainty without additional hyperparameters.

03. Improved exploration compared to traditional methods.

Abstract

Efficient exploration in deep reinforcement learning remains a fundamental challenge, especially in environments characterized by high-dimensional states and sparse rewards. Traditional exploration strategies that rely on random local policy noise, such as $\epsilon$-greedy and Boltzmann exploration methods, often struggle to efficiently balance exploration and exploitation. In this paper, we integrate the notion of (expected) value of information (EVOI) within the well-known Bootstrapped DQN algorithmic framework, to enhance the algorithm's deep exploration ability. Specifically, we develop two novel algorithms that incorporate the expected gain from learning the value of information into Bootstrapped DQN. Our methods use value of information estimates to measure the discrepancies of opinions among distinct network heads, and drive exploration towards areas with the most potential. We…
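
The idea of turning disagreement among network heads into an exploration signal can be illustrated with a small sketch. This is not the paper's algorithm, whose details are not given in the abstract. It assumes the Q-values of the bootstrap heads are treated as samples of a value distribution and applies a value-of-perfect-information style gain in the spirit of Bayesian Q-learning. The names `voi_gain` and `select_action`, and the choice to add the gain to the mean Q-value, are hypothetical.

```python
import numpy as np


def voi_gain(q_heads: np.ndarray) -> np.ndarray:
    """Per-action information gain estimated from head disagreement.

    q_heads has shape (K, A): one row of Q-value estimates per bootstrap
    head, for a single state. Assumes A >= 2. Illustrative assumption only.
    """
    mean_q = q_heads.mean(axis=0)
    order = np.argsort(mean_q)[::-1]      # actions sorted by mean Q, best first
    best, second = order[0], order[1]
    gain = np.empty(q_heads.shape[1])
    for a in range(q_heads.shape[1]):
        if a == best:
            # Gain from discovering the current best action is actually
            # worse than the runner-up.
            diff = mean_q[second] - q_heads[:, a]
        else:
            # Gain from discovering action a is actually better than the
            # current best action.
            diff = q_heads[:, a] - mean_q[best]
        gain[a] = np.maximum(diff, 0.0).mean()
    return gain


def select_action(q_heads: np.ndarray) -> int:
    """Pick the action maximizing mean Q plus the information gain."""
    return int(np.argmax(q_heads.mean(axis=0) + voi_gain(q_heads)))
```

With three heads that agree on action 0 but disagree on action 1, for example `np.array([[1.0, 0.0], [1.0, 2.5], [1.0, 0.2]])`, a greedy rule on the mean picks action 0, whereas this sketch picks action 1 because one head believes it could be much better. When all heads agree, the gain is zero and the rule reduces to greedy selection.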

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research