DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning

Youpeng Zhao; Jian Zhao; Xunhan Hu; Wengang Zhou; Houqiang Li

arXiv:2204.02558 · cs.AI · April 7, 2022 · 1 citation

TL;DR

This paper enhances DouZero, an AI for the card game DouDizhu, with opponent modeling and a coach network. The resulting agent outperforms the original DouZero and takes the top rank on the Botzone leaderboard among more than 400 agents.

Contribution

It introduces opponent modeling and a coach network to improve DouZero's performance and training efficiency in DouDizhu.

Findings

01 Achieved top ranking on the Botzone leaderboard.

02 Improved performance over the original DouZero.

03 Enhanced training speed and effectiveness.

Abstract

Recent years have witnessed great breakthroughs of deep reinforcement learning (DRL) in various perfect and imperfect information games. Among these games, DouDizhu, a popular card game in China, is very challenging due to imperfect information, a large state space, elements of collaboration and a massive number of possible moves from turn to turn. Recently, a DouDizhu AI system called DouZero has been proposed. Trained using a traditional Monte Carlo method with deep neural networks and a self-play procedure, without the abstraction of human prior knowledge, DouZero has outperformed all existing DouDizhu AI programs. In this work, we propose to enhance DouZero by introducing opponent modeling. In addition, we propose a novel coach network to further boost the performance of DouZero and accelerate its training process. With the integration of the above two techniques into…
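
The abstract says DouZero is trained with a traditional Monte Carlo method. As a rough illustration of that idea only, the sketch below averages end-of-game returns per (state, action) pair in a plain table. It is a toy under stated assumptions, not the paper's method: DouZero uses deep neural networks rather than a table, and the function name `mc_update`, the state and action labels, and the reward values are all hypothetical.

```python
from collections import defaultdict


def mc_update(q, counts, episode, final_reward, gamma=1.0):
    """Every-visit Monte Carlo update from one finished episode.

    episode: list of (state, action) pairs in play order (hypothetical labels).
    final_reward: reward observed only when the game ends.
    """
    g = final_reward
    # Walk backwards so each step receives the return discounted from the end.
    for state, action in reversed(episode):
        key = (state, action)
        counts[key] += 1
        # Incremental mean of the returns observed for this (state, action).
        q[key] += (g - q[key]) / counts[key]
        g *= gamma
    return q


# Toy usage: two self-play games that share the same first move.
q = defaultdict(float)
counts = defaultdict(int)
mc_update(q, counts, [("s0", "pass"), ("s1", "play_pair")], final_reward=1.0)
mc_update(q, counts, [("s0", "pass"), ("s1", "play_single")], final_reward=-1.0)
```

After these two games, `q[("s0", "pass")]` is `0.0` (one win and one loss averaged), while `q[("s1", "play_pair")]` is `1.0` and `q[("s1", "play_single")]` is `-1.0`. A neural variant would replace the table with a network regressed toward the same return targets.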

Code & Models

Repositories

submit-paper/doudizhu
PyTorch · Official

Taxonomy

Topics: Gambling Behavior and Treatments

Methods: Q-Learning · Convolution · Dense Connections · Deep Q-Network · Tanh Activation · Feedforward Network · Sigmoid Activation · Long Short-Term Memory · DouZero