StarCraft II: A New Challenge for Reinforcement Learning

Oriol Vinyals; Timo Ewalds; Sergey Bartunov; Petko Georgiev; Alexander Sasha Vezhnevets; Michelle Yeo; Alireza Makhzani; Heinrich Küttler; John Agapiou; Julian Schrittwieser; John Quan; Stephen Gaffney; Stig Petersen; Karen Simonyan; Tom Schaul; Hado van Hasselt; David Silver; Timothy Lillicrap; Kevin Calderone; Paul Keet; Anthony Brunasso; David Lawrence; Anders Ekermo; Jacob Repp; Rodney Tsing

arXiv:1708.04782 · cs.LG · August 17, 2017 · 684 cites

TL;DR

This paper introduces the StarCraft II Learning Environment (SC2LE), a reinforcement learning platform that combines multi-agent interaction, large action and state spaces, and delayed rewards. It challenges current RL algorithms and provides benchmarks for measuring progress.

Contribution

It presents a new RL environment based on StarCraft II, including detailed specifications, datasets, mini-games, and baseline results, to advance research in complex multi-agent reinforcement learning.

Findings

01

Baseline neural networks predict game outcomes and actions from human replay data.

02

Deep RL agents perform well on mini-games but struggle with the main game.

03

SC2LE provides a challenging testbed for future RL research.

Abstract

This paper introduces SC2LE (StarCraft II Learning Environment), a reinforcement learning environment based on the StarCraft II game. This domain poses a new grand challenge for reinforcement learning, representing a more difficult class of problems than considered in most prior work. It is a multi-agent problem with multiple players interacting; there is imperfect information due to a partially observed map; it has a large action space involving the selection and control of hundreds of units; it has a large state space that must be observed solely from raw input feature planes; and it has delayed credit assignment requiring long-term strategies over thousands of steps. We describe the observation, action, and reward specification for the StarCraft II domain and provide an open source Python-based interface for communicating with the game engine. In addition to the main game maps, we…

Code & Models

Videos

DeepMind Publishes StarCraft II Learning Environment | Two Minute Papers #182 · YouTube

Taxonomy

Topics: Digital Games and Media · Reinforcement Learning in Robotics · Artificial Intelligence in Games