Go-Explore: a New Approach for Hard-Exploration Problems

Adrien Ecoffet; Joost Huizinga; Joel Lehman; Kenneth O. Stanley; Jeff Clune

arXiv:1901.10995 · cs.LG · March 2, 2021 · 226 cites

TL;DR

Go-Explore is a reinforcement learning algorithm that significantly improves exploration in sparse-reward environments by remembering states, returning to promising states, and exploring from them, achieving superhuman performance on Atari games.

Contribution

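The paper introduces Go-Explore, an RL algorithm that outperforms existing methods on hard-exploration tasks by remembering previously visited states, returning to promising ones before exploring further, and then making the resulting solutions robust via imitation learning.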

Findings

01. Scored nearly 4x the previous state of the art on Montezuma's Revenge

02. Surpassed the human world record on Montezuma's Revenge, meeting even strict definitions of superhuman performance

03. First RL algorithm to score above zero on Pitfall

Abstract

A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. To address this shortfall, we introduce a new algorithm called Go-Explore. It exploits the following principles: (1) remember previously visited states, (2) first return to a promising state (without exploration), then explore from it, and (3) solve simulated environments through any available means (including by introducing determinism), then robustify via imitation learning. The combined effect of these principles is a dramatic performance improvement on hard-exploration problems. On Montezuma's…
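The first two principles in the abstract (remember visited states, return to a promising one without exploring, then explore from it) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper's code: the toy `ChainEnv` environment, the `Cell` record, the use of the raw position as the archive key, and the count-based selection weight. The sketch covers only the exploration phase in a deterministic, resettable environment; the imitation-learning robustification step is not shown.

```python
import random
from dataclasses import dataclass


class ChainEnv:
    """Hypothetical toy environment: walk along a line, reward only at the far end."""

    def __init__(self, length=20):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def get_state(self):
        # The full simulator state is just the position in this toy example.
        return self.pos

    def set_state(self, state):
        # Deterministic restore: this is what makes "return without exploring" cheap.
        self.pos = state

    def step(self, action):
        # action is -1 (left) or +1 (right); the walker is clamped to [0, length].
        self.pos = max(0, min(self.length, self.pos + action))
        done = self.pos == self.length
        reward = 1.0 if done else 0.0
        return self.pos, reward, done


@dataclass
class Cell:
    state: int          # simulator snapshot used to return to this cell
    trajectory: list    # actions that lead from reset to this cell
    score: float        # cumulative reward along that trajectory
    done: bool = False  # terminal cells are never chosen as starting points
    visits: int = 0     # how many times this cell was selected


def go_explore(env, iterations=500, explore_steps=10, seed=0):
    rng = random.Random(seed)
    start = env.reset()
    archive = {start: Cell(state=env.get_state(), trajectory=[], score=0.0)}

    for _ in range(iterations):
        # 1) Select a cell, favoring rarely visited ones (illustrative heuristic).
        keys = [k for k in archive if not archive[k].done]
        weights = [1.0 / (1 + archive[k].visits) ** 0.5 for k in keys]
        cell = archive[rng.choices(keys, weights=weights)[0]]
        cell.visits += 1

        # 2) Return to it by restoring the saved state, with no exploration.
        env.set_state(cell.state)
        trajectory = list(cell.trajectory)
        score = cell.score

        # 3) Explore from it with random actions, recording what is found.
        for _ in range(explore_steps):
            action = rng.choice([-1, 1])
            key, reward, done = env.step(action)
            trajectory.append(action)
            score += reward

            known = archive.get(key)
            better = (
                known is None
                or score > known.score
                or (score == known.score and len(trajectory) < len(known.trajectory))
            )
            if better:
                visits = known.visits if known is not None else 0
                archive[key] = Cell(env.get_state(), list(trajectory), score, done, visits)
            if done:
                break

    return archive


if __name__ == "__main__":
    archive = go_explore(ChainEnv(length=20))
    goal = archive.get(20)
    print("cells discovered:", len(archive))
    print("goal reached:", goal is not None)
```

Because each archive entry stores the action sequence that reached it, the best trajectory can be replayed from a fresh reset. In a real game the archive key would need to be a coarser summary of the observation than the exact state, so that similar states share a cell.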

Code & Models

Repositories

Models

Satori-reasoning/Satori-7B-Round2 · model · 14 dl · ♡ 11

Videos

Go-Explore: a New Approach for Hard-Exploration Problems · YouTube

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Multimodal Machine Learning Applications

Methods: Go-Explore