Exploring through Random Curiosity with General Value Functions
Aditya Ramesh, Louis Kirsch, Sjoerd van Steenkiste, Jürgen Schmidhuber

TL;DR
This paper introduces RC-GVF, an intrinsic reward method for reinforcement learning that derives exploration bonuses from predicting general value functions. It targets partially observable environments, where it outperforms previous approaches on MiniGrid tasks.
Contribution
The paper proposes RC-GVF, an intrinsic reward that connects curiosity with value prediction to improve exploration in challenging partially observable RL environments, with gains shown on a diabolical lock problem and in MiniGrid.
Findings
RC-GVF improves exploration in diabolical lock problems.
RC-GVF outperforms previous methods in MiniGrid environments.
Panoramic observations further enhance RC-GVF's performance.
Abstract
Efficient exploration in reinforcement learning is a challenging problem commonly addressed through intrinsic rewards. Recent prominent approaches are based on state novelty or variants of artificial curiosity. However, directly applying them to partially observable environments can be ineffective and lead to premature dissipation of intrinsic rewards. Here we propose random curiosity with general value functions (RC-GVF), a novel intrinsic reward function that draws upon connections between these distinct approaches. Instead of using only the current observation's novelty or a curiosity bonus for failing to predict precise environment dynamics, RC-GVF derives intrinsic rewards through predicting temporally extended general value functions. We demonstrate that this improves exploration in a hard-exploration diabolical lock problem. Furthermore, RC-GVF significantly outperforms previous…
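The idea of deriving intrinsic rewards from temporally extended predictions can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify how the general value functions are defined or how the reward is computed. Here we assume that pseudo-rewards come from a fixed random projection of the observation, that the prediction target is their discounted sum along a trajectory, and that the intrinsic reward is the squared prediction error of a linear predictor. The names `discounted_returns`, `intrinsic_rewards`, `random_weights`, and `predictor_weights` are hypothetical.

```python
import numpy as np


def discounted_returns(pseudo_rewards, gamma):
    """Discounted sums G_t = c_t + gamma * G_{t+1}, with zero value after the last step."""
    returns = np.zeros_like(pseudo_rewards, dtype=float)
    running = np.zeros(pseudo_rewards.shape[1:], dtype=float)
    # Walk the trajectory backwards so each step reuses the next step's return.
    for t in range(len(pseudo_rewards) - 1, -1, -1):
        running = pseudo_rewards[t] + gamma * running
        returns[t] = running
    return returns


def intrinsic_rewards(observations, random_weights, predictor_weights, gamma):
    """Per-step squared error between predicted and observed discounted pseudo-returns.

    Assumption: pseudo-rewards are a fixed random projection of the observation,
    and the predictor is linear. Both are illustrative choices.
    """
    pseudo_rewards = np.tanh(observations @ random_weights)   # shape (T, K)
    targets = discounted_returns(pseudo_rewards, gamma)       # shape (T, K)
    predictions = observations @ predictor_weights            # shape (T, K)
    return np.mean((predictions - targets) ** 2, axis=1)      # shape (T,)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs = rng.normal(size=(6, 4))            # 6 time steps, 4 observation features
    random_w = rng.normal(size=(4, 3))       # fixed, never trained
    predictor_w = np.zeros((4, 3))           # untrained predictor
    print(intrinsic_rewards(obs, random_w, predictor_w, gamma=0.9))
```

Because the target at each step aggregates pseudo-rewards over future steps, the error depends on more than the current observation, which is the contrast the abstract draws with novelty bonuses computed from a single observation. In this sketch the reward shrinks as the predictor is trained on the states the agent visits.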
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Neural Networks and Reservoir Computing
