First Go, then Post-Explore: the Benefits of Post-Exploration in Intrinsic Motivation

Zhao Yang; Thomas M. Moerland; Mike Preuss; Aske Plaat

arXiv:2212.03251·cs.LG·January 9, 2023

TL;DR

This paper shows that adding post-exploration (continued exploration after a goal has been reached) to intrinsically motivated goal exploration helps reinforcement learning agents reach a more diverse set of states and improves overall performance, in both tabular and deep RL settings on MiniGrid and MuJoCo environments.

Contribution

It provides an ablation study that isolates the effect of post-exploration within the IMGEP framework by turning it on and off within the same algorithm, an analysis the Go-Explore paper did not include.

Findings

01. Post-exploration increases the diversity of states that RL agents reach.

02. Post-exploration boosts performance in both discrete and continuous tasks.

03. Post-exploration is effective, agnostic to the underlying method, and easy to implement.

Abstract

Go-Explore achieved breakthrough performance on challenging reinforcement learning (RL) tasks with sparse rewards. The key insight of Go-Explore was that successful exploration requires an agent to first return to an interesting state ('Go'), and only then explore into unknown terrain ('Explore'). We refer to such exploration after a goal is reached as 'post-exploration'. In this paper, we present a clear ablation study of post-exploration in a general intrinsically motivated goal exploration process (IMGEP) framework, which the Go-Explore paper did not provide. We study the isolated potential of post-exploration by turning it on and off within the same algorithm, under both tabular and deep RL settings, on both discrete navigation and continuous control tasks. Experiments on a range of MiniGrid and MuJoCo environments show that post-exploration indeed helps IMGEP agents reach more diverse…
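
To make the on/off ablation idea concrete, here is a minimal sketch of a goal-exploration loop with a post-exploration switch. It is not the authors' implementation: the toy grid, the function name `run_imgep`, and all parameters are hypothetical, and the 'Go' phase replays a stored action sequence with some noise as a stand-in for the learned goal-conditioned policy an IMGEP agent would normally use.

```python
import random


def run_imgep(post_explore, size=15, episodes=200, go_eps=0.1,
              post_steps=10, seed=0):
    """Toy goal-exploration loop on a size x size grid (illustrative only).

    Returns the number of distinct states visited.
    """
    rng = random.Random(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    start = (0, 0)
    # Archive: visited state -> one action sequence that reached it.
    archive = {start: []}

    def step(state, action):
        # Deterministic move, clipped at the grid walls.
        x = min(max(state[0] + moves[action][0], 0), size - 1)
        y = min(max(state[1] + moves[action][1], 0), size - 1)
        return (x, y)

    def random_walk(state, path, n):
        # Take n uniformly random actions, recording any new states.
        for _ in range(n):
            action = rng.randrange(4)
            state = step(state, action)
            path.append(action)
            archive.setdefault(state, list(path))
        return state

    # Warm-up: a short random walk so both variants start with some goals.
    random_walk(start, [], post_steps)

    for _ in range(episodes):
        # Sample a goal from the states visited so far.
        goal = rng.choice(list(archive))
        state, path = start, []
        # 'Go' phase: replay the stored actions, with epsilon noise
        # (assumption: stands in for an epsilon-greedy goal-conditioned policy).
        for action in archive[goal]:
            if rng.random() < go_eps:
                action = rng.randrange(4)
            state = step(state, action)
            path.append(action)
            archive.setdefault(state, list(path))
        # 'Explore' phase: only if enabled and the goal was actually reached.
        if post_explore and state == goal:
            random_walk(state, path, post_steps)

    return len(archive)


if __name__ == "__main__":
    print("states visited, post-exploration on: ", run_imgep(True))
    print("states visited, post-exploration off:", run_imgep(False))
```

Calling `run_imgep(True)` and `run_imgep(False)` with the same seed switches only the post-exploration phase, which mirrors the structure of the ablation. Note that in this sketch the "on" variant also takes more environment steps per episode, so the two counts are not a controlled comparison and say nothing about the paper's reported results.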

Taxonomy

Topics: Reinforcement Learning in Robotics · Distributed Control Multi-Agent Systems · Multimodal Machine Learning Applications

Methods: Intrinsically Motivated Goal Exploration Processes · Go-Explore