AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning
Michaël Mathieu, Sherjil Ozair, Srivatsan Srinivasan, Caglar Gulcehre, Shangtong Zhang, Ray Jiang, Tom Le Paine, Richard Powell, Konrad Żołna, Julian Schrittwieser, David Choi, Petko Georgiev, Daniel Toyama, Aja Huang, Roman Ring, Igor Babuschkin, Timo Ewalds

TL;DR
This paper introduces AlphaStar Unplugged, a large-scale offline RL benchmark for StarCraft II, leveraging a massive human gameplay dataset to develop and evaluate offline RL agents in a complex, real-time strategy environment.
Contribution
It establishes a new offline RL benchmark for StarCraft II, providing datasets, tools, and evaluation protocols, and demonstrates improved agents using only offline data.
Findings
Agents trained only on offline data achieved a 90% win rate against the previously published AlphaStar behavior cloning agent.
Established a standardized API and evaluation protocol for offline StarCraft II RL.
Presented baseline agents including behavior cloning, offline actor-critic, and MuZero variants.
Abstract
StarCraft II is one of the most challenging simulated reinforcement learning environments; it is partially observable, stochastic, multi-agent, and mastering StarCraft II requires strategic planning over long time horizons with real-time low-level execution. It also has an active professional competitive scene. StarCraft II is uniquely suited for advancing offline RL algorithms, both because of its challenging nature and because Blizzard has released a massive dataset of millions of StarCraft II games played by human players. This paper leverages that and establishes a benchmark, called AlphaStar Unplugged, introducing unprecedented challenges for offline reinforcement learning. We define a dataset (a subset of Blizzard's release), tools standardizing an API for machine learning methods, and an evaluation protocol. We also present baseline agents, including behavior cloning, offline…
Taxonomy
Topics: Artificial Intelligence in Games · Digital Games and Media · Sports Analytics and Performance
Methods: Attention Is All You Need · Label Smoothing · Linear Layer · Adam · Dropout · Absolute Position Encodings · Monte-Carlo Tree Search · Byte Pair Encoding
