Mastering Atari Games with Limited Data

Weirui Ye; Shaohuai Liu; Thanard Kurutach; Pieter Abbeel; Yang Gao

arXiv:2111.00210 · cs.LG · December 14, 2021 · 40 cites

TL;DR

EfficientZero is a novel model-based visual reinforcement learning algorithm that achieves super-human performance on Atari games using only 100k environment steps, significantly reducing data requirements compared to previous methods.

Contribution

The paper introduces EfficientZero, a sample-efficient, model-based visual RL algorithm built on MuZero. It reaches super-human mean performance on the Atari 100k benchmark and outperforms state-based SAC on some DMControl 100k tasks.

Findings

01 Achieves 194.3% mean human performance on the Atari 100k benchmark.

02 Outperforms state-based SAC on some DMControl 100k tasks.

03 Consumes 500 times less data than DQN to reach comparable performance.
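
The data-budget figures above can be sanity-checked with a little arithmetic. The sketch below assumes the common Atari conventions of a frame skip of 4 and 60 frames per second, and takes the DQN budget to be 200 million frames. None of these constants is stated on this page, so treat them as assumptions rather than the authors' exact accounting.

```python
# Back-of-the-envelope check of the Atari 100k data budget.
# ASSUMPTIONS (not stated on this page): frame skip of 4, 60 frames per
# second of real-time play, and a DQN budget of 200 million frames.

AGENT_STEPS = 100_000        # environment steps in the Atari 100k benchmark
FRAME_SKIP = 4               # assumed frames consumed per agent step
FRAMES_PER_SECOND = 60       # assumed real-time frame rate
DQN_FRAMES = 200_000_000     # assumed DQN training budget in frames


def frames_used(steps: int, frame_skip: int = FRAME_SKIP) -> int:
    """Number of game frames consumed by `steps` agent steps."""
    return steps * frame_skip


def hours_of_play(frames: int, fps: int = FRAMES_PER_SECOND) -> float:
    """Real-time duration, in hours, of `frames` game frames."""
    return frames / fps / 3600


def data_ratio(baseline_frames: int, frames: int) -> float:
    """How many times more frames the baseline consumes."""
    return baseline_frames / frames


frames = frames_used(AGENT_STEPS)
print(f"frames used:   {frames:,}")
print(f"hours of play: {hours_of_play(frames):.2f}")
print(f"DQN / 100k:    {data_ratio(DQN_FRAMES, frames):.0f}x")
```

Under these assumptions, 100k steps correspond to 400,000 frames, which is a little under two hours of play and 500 times fewer frames than the assumed DQN budget.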

Abstract

Reinforcement learning has achieved great success in many applications. However, sample efficiency remains a key challenge, with prominent methods requiring millions (or even billions) of environment steps to train. Recently, there has been significant progress in sample-efficient image-based RL algorithms; however, consistent human-level performance on the Atari game benchmark remains an elusive goal. We propose a sample-efficient model-based visual RL algorithm built on MuZero, which we name EfficientZero. Our method achieves 194.3% mean human performance and 109.0% median performance on the Atari 100k benchmark with only two hours of real-time game experience, and it outperforms state-based SAC on some tasks in the DMControl 100k benchmark. This is the first time an algorithm has achieved super-human performance on Atari games with so little data. EfficientZero's performance is also close to…
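
The "mean human performance" and "median performance" figures are human-normalized scores. A minimal sketch of how such numbers are usually aggregated follows, assuming the standard convention (agent - random) / (human - random) per game. The game names and raw scores below are hypothetical placeholders, not results from the paper.

```python
from statistics import mean, median


def human_normalized(agent: float, random_score: float, human: float) -> float:
    """Human-normalized score: 0.0 matches random play, 1.0 matches the human."""
    if human == random_score:
        raise ValueError("human and random scores must differ")
    return (agent - random_score) / (human - random_score)


# HYPOTHETICAL raw scores as (agent, random, human); not taken from the paper.
SCORES = {
    "game_a": (250.0, 0.0, 100.0),
    "game_b": (120.0, 10.0, 110.0),
    "game_c": (100.0, 50.0, 150.0),
}

normalized = {
    game: human_normalized(agent, random_score, human)
    for game, (agent, random_score, human) in SCORES.items()
}

# Aggregate across games and express as percentages of human performance.
mean_pct = 100 * mean(list(normalized.values()))
median_pct = 100 * median(list(normalized.values()))

print(f"mean human-normalized score:   {mean_pct:.1f}%")
print(f"median human-normalized score: {median_pct:.1f}%")
```

The mean can sit well above the median when a few games have very high normalized scores, which is why both statistics are typically reported together.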

Code & Models

Models

stevietrouble/EfficientZeroRemastered
model · ♡ 1

Datasets

OpenDILabCommunity/Pong-v4-expert-MCTS
dataset · 124 dl

Videos

EfficientZero: Mastering Atari Games with Limited Data (Machine Learning Research Paper Explained) · youtube

Mastering Atari Games with Limited Data · slideslive

Taxonomy

Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Multimodal Machine Learning Applications

Methods: Batch Normalization · Residual Connection · Residual Block · Average Pooling · Global Average Pooling · Dilated Convolution · 1x1 Convolution · Switchable Atrous Convolution · Convolution