Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI: Christopher Berner, Greg Brockman, Brooke Chan, Vicki Cheung, Przemysław Dębiak, Christy Dennison, David Farhi, Quirin Fischer, Shariq Hashme, Chris Hesse, Rafal Józefowicz, Scott Gray, Catherine Olsson, Jakub Pachocki, Michael Petrov, Henrique P. d.O. Pinto

TL;DR
This paper describes OpenAI Five, a large-scale deep reinforcement learning system that learned to play Dota 2 through extensive distributed self-play training and defeated the world champions, Team OG.
Contribution
The paper introduces a scalable distributed training system for reinforcement learning and demonstrates its effectiveness in mastering complex, imperfect-information games like Dota 2.
Findings
OpenAI Five defeated the Dota 2 world champion Team OG.
The system trained for 10 months, learning from batches of approximately 2 million frames every 2 seconds.
Self-play reinforcement learning achieved superhuman performance.
Abstract
On April 13th, 2019, OpenAI Five became the first AI system to defeat the world champions at an esports game. The game of Dota 2 presents novel challenges for AI systems such as long time horizons, imperfect information, and complex, continuous state-action spaces, all challenges which will become increasingly central to more capable AI systems. OpenAI Five leveraged existing reinforcement learning techniques, scaled to learn from batches of approximately 2 million frames every 2 seconds. We developed a distributed training system and tools for continual training which allowed us to train OpenAI Five for 10 months. By defeating the Dota 2 world champion (Team OG), OpenAI Five demonstrates that self-play reinforcement learning can achieve superhuman performance on a difficult task.
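To give a sense of the scale implied by the abstract's figures, the short calculation below converts "approximately 2 million frames every 2 seconds" into a per-second and per-day rate. The 10-month total is only an illustrative upper bound: it assumes 30-day months and uninterrupted training at a constant rate, neither of which is stated in the abstract, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def frames_per_second(batch_frames: float, batch_interval_s: float) -> float:
    """Average frames consumed per second, given batch size and batch interval."""
    return batch_frames / batch_interval_s


def frames_over(days: float, rate_fps: float) -> float:
    """Frames consumed over a number of days at a constant rate."""
    return rate_fps * days * SECONDS_PER_DAY


# Figures from the abstract: ~2 million frames per batch, one batch every 2 seconds.
rate = frames_per_second(2_000_000, 2.0)

# Frames consumed in a single day at that rate.
per_day = frames_over(1, rate)

# ASSUMPTION (not from the paper): 10 months = 300 days of uninterrupted,
# constant-rate training. Treat the result as a rough upper bound only.
upper_bound = frames_over(10 * 30, rate)

print(f"rate:        {rate:.3e} frames/s")
print(f"per day:     {per_day:.3e} frames")
print(f"upper bound: {upper_bound:.3e} frames over 300 days")
```

Under these assumptions the system consumes about one million frames per second, which is the point of the abstract's emphasis on scaling existing techniques rather than introducing new ones.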
Videos
OpenAI Performs Surgery On A Neural Network to Play DOTA 2 · YouTube
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Advanced Bandit Algorithms Research
