Explore-Go: Leveraging Exploration for Generalisation in Deep Reinforcement Learning

Max Weltevrede; Felix Kaubek; Matthijs T.J. Spaan; Wendelin Böhmer

arXiv:2406.08069·cs.LG·September 19, 2024

TL;DR

Explore-Go is an exploration-based method that aims to improve generalisation in deep reinforcement learning by increasing the diversity of states the agent trains on, with reported gains on the Procgen benchmark.

Contribution

The paper proposes Explore-Go, a new approach that expands the training state distribution to improve generalisation in reinforcement learning agents.

Findings

01

Explore-Go improves generalisation in a benchmark environment.

02

The method enhances performance on the Procgen benchmark.

03

The paper gives intuition for why increased exploration can also benefit generalisation to states that cannot be explicitly encountered during training.

Abstract

One of the remaining challenges in reinforcement learning is to develop agents that can generalise to novel scenarios they might encounter once deployed. This challenge is often framed in a multi-task setting where agents train on a fixed set of tasks and have to generalise to new tasks. Recent work has shown that in this setting increased exploration during training can be leveraged to increase the generalisation performance of the agent. This makes sense when the states encountered during testing can actually be explored during training. In this paper, we provide intuition why exploration can also benefit generalisation to states that cannot be explicitly encountered during training. Additionally, we propose a novel method Explore-Go that exploits this intuition by increasing the number of states on which the agent trains. Explore-Go effectively increases the starting state…
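The abstract is cut off before it describes the mechanism, so the following is only a minimal sketch of one way to "increase the number of states on which the agent trains". It assumes that each episode begins with a random-length phase of purely random actions and that training starts from wherever that phase ends. `ChainEnv`, `explore_then_start`, `collect_start_states`, and the `max_explore_steps` parameter are hypothetical names introduced for illustration. They are not taken from the paper.

```python
import random


class ChainEnv:
    """Toy 1-D chain (hypothetical): states 0..n-1, actions -1/+1, reset always returns 0."""

    def __init__(self, n=10):
        self.n = n
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Clamp the move so the agent stays inside the chain.
        self.state = min(self.n - 1, max(0, self.state + action))
        done = self.state == self.n - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def explore_then_start(env, max_explore_steps, rng):
    """Reset, take k random actions with k uniform in [0, max_explore_steps],
    and return the state from which training would begin.

    Assumption: transitions from this phase are not returned, so the caller
    cannot train on them; only the resulting start state is used.
    """
    state = env.reset()
    k = rng.randint(0, max_explore_steps)
    for _ in range(k):
        state, _, done = env.step(rng.choice([-1, 1]))
        if done:
            state = env.reset()
    return state


def collect_start_states(episodes, max_explore_steps, seed=0):
    """Return the set of distinct effective start states over several episodes."""
    rng = random.Random(seed)
    env = ChainEnv()
    return {explore_then_start(env, max_explore_steps, rng) for _ in range(episodes)}


if __name__ == "__main__":
    print("no exploration phase:", sorted(collect_start_states(200, 0)))
    print("with exploration phase:", sorted(collect_start_states(200, 8)))
```

With `max_explore_steps=0` every episode starts from the single reset state. With a positive value the effective start states spread over several positions in the chain, which is the kind of widened training distribution the abstract refers to.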

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Sparse Evolutionary Training