Application of Self-Play Reinforcement Learning to a Four-Player Game of Imperfect Information

Henry Charlesworth

arXiv:1808.10442 · cs.LG · September 3, 2018 · 6 citations

TL;DR

This paper presents a self-play reinforcement learning approach that uses Proximal Policy Optimization (PPO) to train a deep neural network to play Big 2, a four-player card game of imperfect information. The trained agent reaches a level that outperforms amateur human players, without searching a tree of future game states.

Contribution

The paper introduces a new virtual environment for simulating Big 2 and shows that PPO trained purely through self-play can learn strategies that beat amateur human players in this four-player game of imperfect information.
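For readers unfamiliar with PPO, the following is a minimal sketch of its clipped surrogate objective as defined in the original PPO paper. It is not taken from this paper's code: the function name `ppo_clipped_objective`, the clipping value `clip_eps=0.2`, and the sample numbers are all illustrative assumptions, and the Big 2 agent's actual hyperparameters are not stated on this page.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximise).

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    clip_eps:   clipping range (0.2 is an assumed, commonly used value)
    """
    terms = []
    for r, a in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped_r = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(r * a, clipped_r * a))
    return sum(terms) / len(terms)


# Hypothetical sample values, chosen only to exercise each case.
example = ppo_clipped_objective([1.5, 0.5, 1.0], [1.0, -1.0, 2.0])
print(example)
```

Taking the minimum of the two terms removes any incentive to move the policy far from the one that collected the data, which is part of why PPO is a common choice for self-play training.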

Findings

01. Agent outperforms amateur human players

02. Learns in a relatively short training time, without tree search

03. Effective in a complex four-player game

Abstract

We introduce a new virtual environment for simulating a card game known as "Big 2". This is a four-player game of imperfect information with a relatively complicated action space (being allowed to play 1,2,3,4 or 5 card combinations from an initial starting hand of 13 cards). As such it poses a challenge for many current reinforcement learning methods. We then use the recently proposed "Proximal Policy Optimization" algorithm to train a deep neural network to play the game, purely learning via self-play, and find that it is able to reach a level which outperforms amateur human players after only a relatively short amount of training time and without needing to search a tree of future game states.
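To give a rough sense of the size of the action space the abstract describes, the sketch below counts how many distinct subsets of 1 to 5 cards can be drawn from a 13-card hand. This is only an upper bound on card subsets: the rules of Big 2 allow just a fraction of them as legal plays, and the count says nothing about how the paper itself encodes actions. The names `HAND_SIZE`, `PLAY_SIZES`, and `subset_counts` are illustrative.

```python
from math import comb

HAND_SIZE = 13               # starting hand size stated in the abstract
PLAY_SIZES = (1, 2, 3, 4, 5)  # number of cards that may be played at once


def subset_counts(hand_size=HAND_SIZE, play_sizes=PLAY_SIZES):
    """Number of k-card subsets of a hand, for each allowed play size k."""
    return {k: comb(hand_size, k) for k in play_sizes}


counts = subset_counts()
total = sum(counts.values())
print(counts)  # {1: 13, 2: 78, 3: 286, 4: 715, 5: 1287}
print(total)   # 2379
```

Even before legality is checked, a full hand admits a few thousand candidate card subsets, far more than the handful of actions in many standard reinforcement learning benchmarks.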

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Sports Analytics and Performance