BYOL-Explore: Exploration by Bootstrapped Prediction

Zhaohan Daniel Guo; Shantanu Thakoor; Miruna Pîslar; Bernardo Avila Pires; Florent Altché; Corentin Tallec; Alaa Saade; Daniele Calandriello; Jean-Bastien Grill; Yunhao Tang; Michal Valko; Rémi Munos; Mohammad Gheshlaghi Azar; Bilal Piot

arXiv:2206.08332 · cs.LG · June 17, 2022 · 5 cites

TL;DR

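BYOL-Explore is a curiosity-driven exploration method that learns a world representation, the world dynamics, and an exploration policy together from a single latent-space prediction loss, with no auxiliary objective. It solves most DM-HARD-8 tasks without human demonstrations and reaches superhuman performance on the ten hardest Atari exploration games.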

Contribution

It presents a simple, general approach for exploration that combines representation learning, dynamics, and policy optimization in a single prediction loss.

Findings

01

Solves the majority of tasks in DM-HARD-8, a challenging partially-observable, continuous-action hard-exploration benchmark

02

Achieves superhuman performance on the ten hardest exploration games in Atari

03

Makes progress on DM-HARD-8 by adding only its intrinsic reward to the extrinsic reward, whereas prior work needed human demonstrations to get off the ground

Abstract

We present BYOL-Explore, a conceptually simple yet general approach for curiosity-driven exploration in visually-complex environments. BYOL-Explore learns a world representation, the world dynamics, and an exploration policy all-together by optimizing a single prediction loss in the latent space with no additional auxiliary objective. We show that BYOL-Explore is effective in DM-HARD-8, a challenging partially-observable continuous-action hard-exploration benchmark with visually-rich 3-D environments. On this benchmark, we solve the majority of the tasks purely through augmenting the extrinsic reward with BYOL-Explore's intrinsic reward, whereas prior work could only get off the ground with human demonstrations. As further evidence of the generality of BYOL-Explore, we show that it achieves superhuman performance on the ten hardest exploration games in Atari while having a much simpler…
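
To make the idea of "a single prediction loss in the latent space" doubling as an intrinsic reward more concrete, here is a minimal Python sketch. The abstract does not give these details, so everything below is an assumption for illustration rather than the authors' implementation: the L2 normalization of embeddings, the slowly updated target parameters (suggested only by the word "bootstrapped" in the title), the function names, and the `scale` and `decay` parameters are all hypothetical.

```python
import math


def l2_normalize(v, eps=1e-8):
    """Scale a vector to (approximately) unit length. Assumed preprocessing."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def latent_prediction_error(predicted, target):
    """Squared distance between normalized predicted and target embeddings.

    In this sketch the same quantity serves as the training loss for the
    predictor and as the curiosity signal: poorly predicted latents are
    treated as novel.
    """
    p = l2_normalize(predicted)
    t = l2_normalize(target)
    return sum((a - b) ** 2 for a, b in zip(p, t))


def ema_update(target_params, online_params, decay=0.99):
    """Move target parameters slowly toward the online ones (assumed scheme)."""
    return [decay * t + (1.0 - decay) * o
            for t, o in zip(target_params, online_params)]


def shaped_reward(extrinsic, intrinsic, scale=0.1):
    """Augment the extrinsic reward with a scaled intrinsic reward.

    `scale` is a hypothetical mixing coefficient, not a value from the paper.
    """
    return extrinsic + scale * intrinsic


# Usage: a perfectly predicted latent yields no bonus, a mispredicted one does.
familiar = latent_prediction_error([1.0, 0.0], [2.0, 0.0])
novel = latent_prediction_error([1.0, 0.0], [0.0, 1.0])
reward = shaped_reward(extrinsic=0.0, intrinsic=novel)
```

Under these assumptions, an agent with zero extrinsic reward still receives a positive signal whenever its latent prediction is wrong, which is the sense in which the intrinsic reward augments the extrinsic one.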

Code & Models

Datasets

misovalko/my-research-papers
dataset · 21 dl

Videos

BYOL-Explore: Exploration by Bootstrapped Prediction · slideslive

Taxonomy

Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques