DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning

Daochen Zha; Jingru Xie; Wenye Ma; Sheng Zhang; Xiangru Lian; Xia Hu; Ji Liu

arXiv:2106.06135 · cs.AI · June 14, 2021 · 21 cites

TL;DR

DouZero is a deep reinforcement learning system for DouDizhu, a three-player card game. It augments classical Monte-Carlo methods with deep neural networks, action encoding, and parallel actors, and it outperforms existing DouDizhu AI programs.

Contribution

The paper introduces DouZero, a novel AI system that applies deep neural networks and Monte-Carlo methods to excel in DouDizhu, a challenging imperfect-information game with large action spaces.

Findings

01

DouZero outperforms existing DouDizhu AI programs.

02

It ranks first among 344 AI agents on the Botzone leaderboard.

03

The approach demonstrates that classical Monte-Carlo methods can succeed in a complex domain with a large, turn-dependent action space.

Abstract

Games are abstractions of the real world, where artificial agents learn to compete and cooperate with other agents. While significant achievements have been made in various perfect- and imperfect-information games, DouDizhu (a.k.a. Fighting the Landlord), a three-player card game, is still unsolved. DouDizhu is a very challenging domain with competition, collaboration, imperfect information, large state space, and particularly a massive set of possible actions where the legal actions vary significantly from turn to turn. Unfortunately, modern reinforcement learning algorithms mainly focus on simple and small action spaces, and not surprisingly, are shown not to make satisfactory progress in DouDizhu. In this work, we propose a conceptually simple yet effective DouDizhu AI system, namely DouZero, which enhances traditional Monte-Carlo methods with deep neural networks, action encoding,…
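The idea of pairing Monte-Carlo value estimation with action encoding can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the DouZero codebase: the `encode` function, the `LinearQ` class (a linear stand-in for a deep network), and the one-step toy game in `train_toy`. The sketch scores each legal action by feeding a combined state-action feature vector to a value function, picks actions epsilon-greedily, and regresses every visited pair onto the episode's final return.

```python
import random


def encode(state, action):
    """Hypothetical encoding: concatenate state features with action features."""
    return list(state) + list(action)


class LinearQ:
    """Stand-in for a deep Q-network: a linear model trained by SGD on squared error."""

    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.lr = lr

    def value(self, features):
        return sum(w * x for w, x in zip(self.w, features))

    def update(self, features, target):
        error = target - self.value(features)
        for i, x in enumerate(features):
            self.w[i] += self.lr * error * x


def select_action(q, state, legal_actions, epsilon, rng):
    """Epsilon-greedy choice over whatever actions are legal this turn."""
    if rng.random() < epsilon:
        return rng.choice(legal_actions)
    return max(legal_actions, key=lambda a: q.value(encode(state, a)))


def mc_update(q, episode, episode_return):
    """Every-visit Monte-Carlo: regress each visited (state, action) onto the return."""
    for state, action in episode:
        q.update(encode(state, action), episode_return)


def train_toy(episodes=200, epsilon=0.2, seed=0):
    """Toy one-step game (an assumption, not DouDizhu): second action wins, first loses."""
    rng = random.Random(seed)
    state = (1.0,)
    legal_actions = [(1.0, 0.0), (0.0, 1.0)]
    q = LinearQ(dim=3)
    for _ in range(episodes):
        action = select_action(q, state, legal_actions, epsilon, rng)
        episode_return = 1.0 if action == (0.0, 1.0) else -1.0
        mc_update(q, [(state, action)], episode_return)
    return q
```

Because the action is part of the input rather than an output index, the same value function can score a list of legal actions whose length changes from turn to turn. Calling `train_toy()` and then `select_action` with `epsilon=0.0` should favor the winning action in this toy setting.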

Code & Models

Repositories

kwai/DouZero (PyTorch, official)

Videos

DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning · SlidesLive

Taxonomy

Topics: Digital Games and Media · Gambling Behavior and Treatments · Artificial Intelligence in Games

Methods: Convolution · Dense Connections · Tanh Activation · Feedforward Network · Q-Learning · Deep Q-Network · Sigmoid Activation · Long Short-Term Memory · DouZero