Explore-Go: Leveraging Exploration for Generalisation in Deep Reinforcement Learning
Max Weltevrede, Felix Kaubek, Matthijs T.J. Spaan, Wendelin Böhmer

TL;DR
Explore-Go is an exploration-based method that improves generalisation in deep reinforcement learning by increasing the diversity of states the agent trains on, with improved performance reported on the Procgen benchmark.
Contribution
The paper proposes Explore-Go, a new approach that expands the training state distribution to improve generalisation in reinforcement learning agents.
Findings
Explore-Go improves generalisation in a benchmark environment.
The method enhances performance on the Procgen benchmark.
Increased exploration also benefits generalisation to states that cannot be explicitly encountered during training.
Abstract
One of the remaining challenges in reinforcement learning is to develop agents that can generalise to novel scenarios they might encounter once deployed. This challenge is often framed in a multi-task setting where agents train on a fixed set of tasks and have to generalise to new tasks. Recent work has shown that in this setting increased exploration during training can be leveraged to increase the generalisation performance of the agent. This makes sense when the states encountered during testing can actually be explored during training. In this paper, we provide intuition why exploration can also benefit generalisation to states that cannot be explicitly encountered during training. Additionally, we propose a novel method Explore-Go that exploits this intuition by increasing the number of states on which the agent trains. Explore-Go effectively increases the starting state…
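The idea of increasing the number of states on which the agent trains can be illustrated with a small sketch. The abstract is truncated here, so the following is not the authors' implementation. It assumes one plausible reading: before each training episode, a random number of exploratory steps is taken, and the state reached is treated as the effective start of that episode. The toy environment `ChainEnv`, the helper names, and the use of uniformly random actions as the exploration strategy are all hypothetical choices made for illustration.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: an agent on a line of `size` cells."""

    def __init__(self, size=11):
        self.size = size
        self.start = size // 2  # every episode nominally starts in the middle
        self.pos = self.start

    def reset(self):
        self.pos = self.start
        return self.pos

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to the line.
        self.pos = max(0, min(self.size - 1, self.pos + action))
        return self.pos


def explore_then_reset(env, max_explore_steps, rng):
    """Reset, then take a random number of uniformly random actions.

    Assumption: the state reached afterwards serves as the effective start
    state of the training episode. Random actions stand in for whatever
    exploration strategy is actually used.
    """
    state = env.reset()
    num_steps = rng.randint(0, max_explore_steps)  # bounds are inclusive
    for _ in range(num_steps):
        state = env.step(rng.choice((-1, 1)))
    return state


def effective_start_states(env, max_explore_steps, episodes, seed=0):
    """Collect the distinct effective start states seen over many episodes."""
    rng = random.Random(seed)
    return {explore_then_reset(env, max_explore_steps, rng) for _ in range(episodes)}
```

With `max_explore_steps=0` the agent always starts from the single nominal start state. With a positive value, the set of effective start states grows to cover cells near that start, which is the sense in which the training state distribution is widened in this sketch.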
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Sparse Evolutionary Training
