Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

Noam Brown; Anton Bakhtin; Adam Lerer; Qucheng Gong

arXiv:2007.13544 · cs.GT · December 1, 2020 · 62 cites

TL;DR

This paper introduces ReBeL, a framework that combines self-play reinforcement learning with search and provably converges to a Nash equilibrium in two-player zero-sum games, including imperfect-information ones. It achieves superhuman performance in heads-up no-limit Texas hold'em poker while using less domain knowledge than prior poker AIs.

Contribution

ReBeL is a general framework that extends deep reinforcement learning and search to imperfect-information games, with provable convergence to a Nash equilibrium in any two-player zero-sum game.

Findings

01 ReBeL converges to an approximate Nash equilibrium in imperfect-information games.

02 ReBeL achieves superhuman performance in heads-up no-limit Texas hold'em poker.

03 ReBeL requires less domain knowledge than previous poker AIs.

Abstract

The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best exemplified by AlphaZero. However, prior algorithms of this form cannot cope with imperfect-information games. This paper presents ReBeL, a general framework for self-play reinforcement learning and search that provably converges to a Nash equilibrium in any two-player zero-sum game. In the simpler setting of perfect-information games, ReBeL reduces to an algorithm similar to AlphaZero. Results in two different imperfect-information games show ReBeL converges to an approximate Nash equilibrium. We also show ReBeL achieves superhuman performance in heads-up no-limit Texas hold'em poker, while using far less domain knowledge than any prior poker AI.
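
To make "approximate Nash equilibrium" concrete, the sketch below computes exploitability in a small two-player zero-sum matrix game: the total amount both players could gain by switching to a best response. It is not ReBeL and is not taken from the paper. Rock-paper-scissors, the `PAYOFF` matrix, and the `exploitability` helper are illustrative assumptions chosen only to show what is being measured when a strategy profile is said to be close to equilibrium.

```python
# Illustrative assumption: rock-paper-scissors payoffs for the row player.
# The game is zero-sum, so the column player receives the negation.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def exploitability(row_strategy, col_strategy, payoff=PAYOFF):
    """Total gain available to both players from best-responding.

    Zero exactly at a Nash equilibrium; small at an approximate one.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Expected row-player payoff of each pure row action against col_strategy.
    row_values = [
        sum(payoff[i][j] * col_strategy[j] for j in range(n_cols))
        for i in range(n_rows)
    ]
    # Expected column-player payoff of each pure column action against
    # row_strategy (the column player's payoffs are the negated matrix).
    col_values = [
        -sum(payoff[i][j] * row_strategy[i] for i in range(n_rows))
        for j in range(n_cols)
    ]
    # In a zero-sum game the players' current values cancel, so the summed
    # best-response gain equals the sum of the two best-response values.
    return max(row_values) + max(col_values)


uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1.0, 0.0, 0.0]

print(exploitability(uniform, uniform))      # equilibrium profile
print(exploitability(always_rock, uniform))  # row player can be exploited
```

A learning procedure that converges to an approximate Nash equilibrium drives this quantity toward zero; a profile such as always playing rock leaves it strictly positive because the opponent can profit by answering with paper.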

Code & Models

Repositories

facebookresearch/rebel (Official)

Videos

ReBeL - Combining Deep Reinforcement Learning and Search for Imperfect-Information Games (Explained) · youtube

Combining Deep Reinforcement Learning and Search for Imperfect-Information Games · slideslive

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Digital Games and Media

Methods: AlphaZero