Value of Information-Enhanced Exploration in Bootstrapped DQN
Stergios Plataniotis, Charilaos Akasiadis, Georgios Chalkiadakis

TL;DR
This paper enhances deep exploration in reinforcement learning by integrating value of information into Bootstrapped DQN, leading to improved performance in complex, sparse-reward environments without extra hyperparameters.
Contribution
The paper introduces two novel algorithms that incorporate value of information estimates into Bootstrapped DQN to improve exploration efficiency.
Findings
Enhanced performance in Atari games with sparse rewards.
Better utilization of uncertainty without additional hyperparameters.
Improved exploration compared to traditional methods such as ε-greedy and Boltzmann exploration.
Abstract
Efficient exploration in deep reinforcement learning remains a fundamental challenge, especially in environments characterized by high-dimensional states and sparse rewards. Traditional exploration strategies that rely on random local policy noise, such as ε-greedy and Boltzmann exploration methods, often struggle to efficiently balance exploration and exploitation. In this paper, we integrate the notion of (expected) value of information (EVOI) within the well-known Bootstrapped DQN algorithmic framework, to enhance the algorithm's deep exploration ability. Specifically, we develop two novel algorithms that incorporate the expected gain from learning the value of information into Bootstrapped DQN. Our methods use value of information estimates to measure the discrepancies of opinions among distinct network heads, and drive exploration towards areas with the most potential. We…
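To make the idea of measuring disagreement among heads concrete, here is a minimal sketch. It is not the authors' algorithm: the abstract is truncated and does not give their formulas. The sketch assumes the classic myopic value-of-perfect-information estimate from Bayesian Q-learning and treats the K head outputs for one state as samples of the Q-values. The function names `evoi_per_action` and `select_action`, and the rule of adding the estimate to the mean Q-value, are illustrative assumptions.

```python
import numpy as np


def evoi_per_action(q_heads):
    """Myopic value-of-information estimate per action (illustrative only).

    q_heads: array-like of shape (K, A), one row of Q-value estimates per
    bootstrap head for a single state. Requires at least two actions.
    """
    q_heads = np.asarray(q_heads, dtype=float)
    mean_q = q_heads.mean(axis=0)

    # Best and second-best actions according to the mean over heads.
    order = np.argsort(mean_q)[::-1]
    best, second = order[0], order[1]

    # For a non-best action, information is valuable when a head believes
    # the action beats the current best action's mean value.
    gains = np.maximum(q_heads - mean_q[best], 0.0)

    # For the best action, information is valuable when a head believes
    # it is worse than the second-best action's mean value.
    gains[:, best] = np.maximum(mean_q[second] - q_heads[:, best], 0.0)

    # Average the gain over heads (heads act as samples of the Q-values).
    return gains.mean(axis=0)


def select_action(q_heads):
    """Pick the action maximizing mean Q plus the information bonus."""
    q_heads = np.asarray(q_heads, dtype=float)
    scores = q_heads.mean(axis=0) + evoi_per_action(q_heads)
    return int(np.argmax(scores))


# Heads agree on action 0 but disagree strongly about action 1.
q = [[1.0, 0.0], [1.0, 5.0], [1.0, -5.0]]
print(evoi_per_action(q))
print(select_action(q))
```

When the heads agree, the bonus is zero and the choice reduces to the greedy action under the mean Q-value. The bonus grows only where disagreement among heads could change which action looks best, which is one way of directing exploration without an extra tunable coefficient.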
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research
