Dealing with Sparse Rewards in Reinforcement Learning
Joshua Hare

TL;DR
This paper explores and compares various reinforcement learning methods designed to handle environments with sparse rewards, introducing a novel approach that combines curiosity-driven exploration and unsupervised auxiliary tasks.
Contribution
It presents a new reinforcement learning solution that merges two existing state-of-the-art methods to better address sparse reward challenges.
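As a rough illustration of how curiosity-driven exploration can supply a learning signal when extrinsic rewards are sparse, the sketch below uses one common formulation: the intrinsic reward is the prediction error of a forward model. This is not taken from the paper. The class `LinearForwardModel`, the function `shaped_reward`, and the weight `beta` are hypothetical names chosen for this example, and the unsupervised auxiliary tasks are not shown.

```python
import numpy as np


class LinearForwardModel:
    """Hypothetical linear forward model: predicts next state from (state, action)."""

    def __init__(self, state_dim, action_dim, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(state_dim, state_dim + action_dim))
        self.lr = lr

    def intrinsic_reward(self, state, action, next_state):
        # Curiosity bonus: half the squared prediction error of the forward model.
        x = np.concatenate([state, action])
        error = self.W @ x - next_state
        return 0.5 * float(error @ error)

    def update(self, state, action, next_state):
        # One gradient step on the squared prediction error.
        x = np.concatenate([state, action])
        error = self.W @ x - next_state
        self.W -= self.lr * np.outer(error, x)


def shaped_reward(extrinsic, intrinsic, beta=0.1):
    # Total reward seen by the agent; beta (an assumed value) scales the bonus.
    return extrinsic + beta * intrinsic


# Usage: a transition the model has seen many times becomes less "surprising".
model = LinearForwardModel(state_dim=4, action_dim=2)
s = np.ones(4)
a = np.ones(2)
s_next = np.array([1.0, -1.0, 0.5, 2.0])

before = model.intrinsic_reward(s, a, s_next)
for _ in range(50):
    model.update(s, a, s_next)
after = model.intrinsic_reward(s, a, s_next)
```

In this sketch the bonus shrinks as a transition becomes predictable, so the agent is nudged toward unfamiliar states even while the extrinsic reward stays at zero.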
Findings
The combined approach improves learning efficiency in sparse reward environments.
Different methods show varying effectiveness depending on environment difficulty.
The novel method outperforms individual existing solutions in tested video game environments.
Abstract
Successfully navigating a complex environment to obtain a desired outcome is a difficult task that, until recently, was believed to be achievable only by humans. This perception has been broken down over time, especially with the introduction of deep reinforcement learning, which has greatly increased the difficulty of tasks that can be automated. However, traditional reinforcement learning agents require the environment to provide frequent extrinsic rewards, which are not known or accessible in many real-world environments. This project aims to explore and contrast existing reinforcement learning solutions that circumvent the difficulties of an environment that provides sparse rewards. Different reinforcement learning solutions will be implemented over several video game environments with varying difficulty and varying frequency of rewards, so as to properly investigate the…
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Adaptive Dynamic Programming Control
