Learning to Play Imperfect-Information Games by Imitating an Oracle Planner

Rinu Boney; Alexander Ilin; Juho Kannala; Jarno Seppänen

arXiv:2012.12186 · cs.AI · December 23, 2020

TL;DR

This paper introduces a model-based approach where an oracle planner with full state access guides the training of an agent in complex imperfect-information games, enabling efficient strategy learning with limited data.

Contribution

It presents a method for distilling the decisions of an oracle planner, which sees the full game state, into a learning agent that plays large imperfect-information games from partial observations, improving over model-free methods.

Findings

01

A planner with fixed-depth tree search and decoupled Thompson sampling outperforms naive Monte Carlo tree search in large combinatorial action spaces.

02

The follower policy learns effective strategies after training on a few hundred battles.

03

The approach successfully applies to complex games like Clash Royale and Pommerman.

Abstract

We consider learning to play multiplayer imperfect-information games with simultaneous moves and large state-action spaces. Previous attempts to tackle such challenging games have largely focused on model-free learning methods, often requiring hundreds of years of experience to produce competitive agents. Our approach is based on model-based planning. We tackle the problem of partial observability by first building an (oracle) planner that has access to the full state of the environment and then distilling the knowledge of the oracle to a (follower) agent which is trained to play the imperfect-information game by imitating the oracle's choices. We experimentally show that planning with naive Monte Carlo tree search does not perform very well in large combinatorial action spaces. We therefore propose planning with a fixed-depth tree search and decoupled Thompson sampling for action…
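
The decoupled Thompson sampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' planner: the Beta-Bernoulli posteriors, the class name `DecoupledThompsonSampler`, the helper `toy_rollout`, and the two-player toy game are all assumptions made for illustration. The only idea taken from the abstract is that in a simultaneous-move game each player selects its action from its own statistics ("decoupled"), rather than searching over the joint action space.

```python
import random


class DecoupledThompsonSampler:
    """Hypothetical sketch: one independent Beta posterior per player and action."""

    def __init__(self, num_actions_per_player, rng=None):
        self.rng = rng or random.Random()
        # Beta(1, 1) prior for every action of every player.
        self.alpha = [[1.0] * n for n in num_actions_per_player]
        self.beta = [[1.0] * n for n in num_actions_per_player]

    def select(self):
        """Sample each player's action independently from its own posteriors."""
        joint = []
        for a, b in zip(self.alpha, self.beta):
            samples = [self.rng.betavariate(ai, bi) for ai, bi in zip(a, b)]
            joint.append(max(range(len(samples)), key=samples.__getitem__))
        return tuple(joint)

    def update(self, joint_action, rewards):
        """Update only the arm each player actually played; rewards lie in [0, 1]."""
        for player, (action, r) in enumerate(zip(joint_action, rewards)):
            self.alpha[player][action] += r
            self.beta[player][action] += 1.0 - r

    def most_played(self):
        """Return, per player, the action with the largest visit count."""
        return tuple(
            max(range(len(a)), key=lambda i: a[i] + b[i])
            for a, b in zip(self.alpha, self.beta)
        )


def toy_rollout(joint_action, rng):
    """Assumed toy game: player 0 wins with prob. 0.9 on action 1, else 0.1."""
    p_win = 0.9 if joint_action[0] == 1 else 0.1
    r0 = 1.0 if rng.random() < p_win else 0.0
    return (r0, 1.0 - r0)  # zero-sum outcome for player 1


def run_demo(seed=0, rounds=500):
    rng = random.Random(seed)
    sampler = DecoupledThompsonSampler([2, 3], rng)
    for _ in range(rounds):
        joint = sampler.select()
        sampler.update(joint, toy_rollout(joint, rng))
    return sampler
```

Calling `run_demo()` returns a sampler whose `most_played()` gives each player's most-visited action. In a real planner the rollout would be replaced by a simulation of the game from the current full state, and the statistics would be kept per node of the fixed-depth tree; neither detail is specified in the text above.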

Code & Models

Repositories

rinuboney/l2p-pommerman (PyTorch, official)
