Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems
Christopher Stanton, Jeff Clune

TL;DR
Deep Curiosity Search (DeepCS) enhances reinforcement learning by promoting intra-life exploration, leading to improved performance on challenging sparse- and dense-reward games, surpassing or matching state-of-the-art methods.
Contribution
DeepCS introduces intra-life exploration rewards, which reward an agent for doing something new within the current episode. The summary reports improved performance on several challenging RL environments relative to prior across-training novelty methods.
Findings
DeepCS matches state-of-the-art on Montezuma's Revenge.
DeepCS doubles A2C performance on Seaquest.
DeepCS improves exploration on multiple hard games.
Abstract
Traditional exploration methods in RL require agents to perform random actions to find rewards. But these approaches struggle on sparse-reward domains like Montezuma's Revenge where the probability that any random action sequence leads to reward is extremely low. Recent algorithms have performed well on such tasks by encouraging agents to visit new states or perform new actions in relation to all prior training episodes (which we call across-training novelty). But such algorithms do not consider whether an agent exhibits intra-life novelty: doing something new within the current episode, regardless of whether those behaviors have been performed in previous episodes. We hypothesize that across-training novelty might discourage agents from revisiting initially non-rewarding states that could become important stepping stones later in training. We introduce Deep Curiosity Search (DeepCS),…
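The distinction between intra-life and across-training novelty can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: the class names, the `state_key` abstraction (any hashable summary of a state), the fixed bonus, and the `1/sqrt(count)` decay used as a stand-in for across-training methods are all hypothetical choices made here.

```python
from collections import defaultdict


class IntraLifeNovelty:
    """Hypothetical tracker: pays a bonus the first time a state key
    is seen within the current episode, and forgets everything on reset."""

    def __init__(self, bonus=1.0):
        self.bonus = bonus
        self.visited = set()

    def reset(self):
        # Called at the start of each episode ("life"): memory is wiped.
        self.visited.clear()

    def reward(self, state_key):
        if state_key in self.visited:
            return 0.0
        self.visited.add(state_key)
        return self.bonus


class AcrossTrainingNovelty:
    """Hypothetical stand-in for across-training novelty: a count-based
    bonus whose counts persist over all episodes."""

    def __init__(self, bonus=1.0):
        self.bonus = bonus
        self.counts = defaultdict(int)

    def reset(self):
        # Counts are deliberately kept between episodes.
        pass

    def reward(self, state_key):
        self.counts[state_key] += 1
        return self.bonus / self.counts[state_key] ** 0.5


def run_episode(tracker, state_keys):
    """Reset the tracker, then collect the exploration reward per step."""
    tracker.reset()
    return [tracker.reward(k) for k in state_keys]


if __name__ == "__main__":
    path = ["room_a", "room_b", "room_a"]  # made-up state keys
    intra, across = IntraLifeNovelty(), AcrossTrainingNovelty()
    for episode in range(2):
        print(episode, run_episode(intra, path), run_episode(across, path))
```

In this sketch the intra-life tracker pays the same rewards in every episode, so a state that was unrewarding early in training remains worth revisiting later, whereas the persistent counts make the across-training bonus for that state shrink from one episode to the next.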
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Multimodal Machine Learning Applications
Methods: Prioritized Experience Replay · Ape-X · A2C
