Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels

Sai Rajeswar; Pietro Mazzaglia; Tim Verbelen; Alexandre Piché; Bart Dhoedt; Aaron Courville; Alexandre Lacoste

arXiv:2209.12016 · cs.AI · May 26, 2023 · 5 cites

TL;DR

This paper introduces a novel unsupervised model-based reinforcement learning approach that pre-trains agents and adapts efficiently to downstream tasks, significantly outperforming previous methods on the Unsupervised RL Benchmark from pixels.

Contribution

It proposes a new method combining unsupervised model-based RL with a hybrid planner and task-aware fine-tuning, advancing the state-of-the-art in visual control benchmarks.
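To make the idea of a hybrid planner concrete, here is a minimal sketch of one generic way to combine model-based planning with a learned policy and value function. It is not the paper's Dyna-MPC algorithm: it uses plain random shooting, and the `dynamics`, `reward`, `value`, and `policy` callables are hypothetical toy stand-ins for learned components.

```python
import numpy as np


def plan_first_action(state, dynamics, reward, value, policy,
                      horizon=5, n_samples=64, noise_std=0.3, seed=0):
    """Score sampled action sequences in a model and return the best first action.

    Assumed (hypothetical) interfaces: dynamics(s, a) -> next state,
    reward(s, a) -> float, value(s) -> float, policy(s) -> action in [-1, 1].
    """
    rng = np.random.default_rng(seed)
    best_return, best_action = -np.inf, None
    for _ in range(n_samples):
        s, total, first = state, 0.0, None
        for t in range(horizon):
            # Start from the policy's proposal and perturb it with Gaussian noise.
            a = float(np.clip(policy(s) + noise_std * rng.standard_normal(), -1.0, 1.0))
            if t == 0:
                first = a
            total += reward(s, a)
            s = dynamics(s, a)
        # Bootstrap with the value estimate beyond the planning horizon.
        total += value(s)
        if total > best_return:
            best_return, best_action = total, first
    return best_action


# Toy 1-D problem (all hypothetical): drive the state toward zero.
def dynamics(s, a):
    return s + 0.5 * a


def reward(s, a):
    return -(s ** 2)


def value(s):
    return -(s ** 2)


def policy(s):
    return 0.0  # uninformed proposal; the planner improves on it


action = plan_first_action(1.0, dynamics, reward, value, policy)
```

The policy narrows the search to plausible actions, the model rollouts refine them, and the value estimate accounts for rewards past the short planning horizon.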

Findings

01 Achieved 93.59% overall normalized performance on URLB, surpassing previous baselines.

02 Validated robustness and generalization through large-scale empirical studies.

03 Showed robust performance on the Real-World RL benchmark, suggesting resilience to environment perturbations during adaptation.
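As a rough illustration of what an overall normalized score means, the sketch below assumes each task's return is divided by a reference score for that task and the ratios are averaged into a percentage. The task names and numbers are hypothetical and are not results from the paper; the exact normalization used on URLB should be checked against the benchmark itself.

```python
def overall_normalized_performance(agent_returns, reference_returns):
    """Average per-task ratio of agent return to reference return, in percent.

    Both arguments map task name -> episode return. The reference is assumed
    to be a per-task baseline score (hypothetical values in the example).
    """
    ratios = [agent_returns[task] / reference_returns[task] for task in agent_returns]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical returns for two tasks, for illustration only.
agent = {"task_a": 900.0, "task_b": 450.0}
reference = {"task_a": 1000.0, "task_b": 500.0}

score = overall_normalized_performance(agent, reference)
print(f"{score:.2f}%")
```

Averaging ratios rather than raw returns keeps tasks with large reward scales from dominating the aggregate.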

Abstract

Controlling artificial agents from visual sensory data is an arduous task. Reinforcement learning (RL) algorithms can succeed but require large amounts of interactions between the agent and the environment. To alleviate the issue, unsupervised RL proposes to employ self-supervised interaction and learning, for adapting faster to future tasks. Yet, as shown in the Unsupervised RL Benchmark (URLB; Laskin et al. 2021), whether current unsupervised strategies can improve generalization capabilities is still unclear, especially in visual control settings. In this work, we study the URLB and propose a new method to solve it, using unsupervised model-based RL for pre-training the agent, and a task-aware fine-tuning strategy combined with a newly proposed hybrid planner, Dyna-MPC, to adapt the agent for downstream tasks. On URLB, our method obtains 93.59% overall normalized performance,…

Code & Models

Repositories

mazpie/mastering-urlb (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition