Automatically Reinforcing a Game AI

David L. St-Pierre, Jean-Baptiste Hoock, Jialin Liu, Fabien Teytaud, and Olivier Teytaud

arXiv:1607.08100 · cs.AI · July 28, 2016 · 2 citations

TL;DR

This paper explores portfolio methods for strengthening game-playing AI. A single Game Playing Program (GPP) is decomposed into several variants, through parameters or simply random seeds, and a portfolio over those variants is trained offline or online, yielding a more robust and stronger player.

Contribution

The paper introduces two offline portfolio approaches, BestArm and Nash-portfolio, and an online bandit-based method, all aimed at improving the robustness and strength of a game AI.

Findings

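01 Nash-portfolio is much more robust than BestArm against an opponent who repeats games and learns.

02 Both offline methods do well in a "one game" test against the original GPP; BestArm performs poorly once the opponent learns.

03 The online bandit approach adapts effectively to game conditions.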
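The online idea in the last finding can be illustrated with a small bandit loop over variants of one GPP. This is a minimal sketch, not the paper's algorithm: the exact bandit rule is not given on this page, so UCB1 is used as a stand-in, and `TRUE_WIN_RATE`, `simulated_game`, and `run_online` are hypothetical names invented for the illustration. Each arm stands for one seed or parameter setting, and the reward is 1 for a win and 0 for a loss.

```python
import math
import random


def ucb1_select(wins, plays, total):
    """Pick an arm: any unplayed arm first, otherwise the highest UCB1 score."""
    for arm, n in enumerate(plays):
        if n == 0:
            return arm
    return max(
        range(len(plays)),
        key=lambda a: wins[a] / plays[a]
        + math.sqrt(2.0 * math.log(total) / plays[a]),
    )


def run_online(play_game, n_arms, n_games):
    """Play n_games, choosing a variant (arm) before each game with UCB1."""
    wins = [0.0] * n_arms
    plays = [0] * n_arms
    for t in range(1, n_games + 1):
        arm = ucb1_select(wins, plays, t)
        reward = play_game(arm)  # 1.0 for a win, 0.0 for a loss
        plays[arm] += 1
        wins[arm] += reward
    return plays


# Hypothetical stand-in for real games: each variant has a fixed,
# hidden win rate against the current opponent.
rng = random.Random(0)
TRUE_WIN_RATE = [0.3, 0.5, 0.8]


def simulated_game(arm):
    return 1.0 if rng.random() < TRUE_WIN_RATE[arm] else 0.0


plays = run_online(simulated_game, n_arms=3, n_games=2000)
print("games played per variant:", plays)
```

In this toy setting the loop concentrates its games on the variant with the highest win rate. Against an opponent that itself adapts, win rates are not fixed, and an adversarial bandit rule may be a better fit than UCB1; that choice is left open here.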

Abstract

A recent research trend in Artificial Intelligence (AI) is the combination of several programs into one single, stronger, program; this is termed portfolio methods. We here investigate the application of such methods to Game Playing Programs (GPPs). In addition, we consider the case in which only one GPP is available, by decomposing this single GPP into several ones through the use of parameters or even simply random seeds. These portfolio methods are trained in a learning phase. We propose two different offline approaches. The simplest one, BestArm, is a straightforward optimization of seeds or parameters; it performs quite well against the original GPP, but performs poorly against an opponent which repeats games and learns. The second one, namely Nash-portfolio, performs similarly in a "one game" test, and is much more robust against an opponent who learns. We also propose an…
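The contrast between the two offline approaches can be made concrete with a toy example. This is a sketch under stated assumptions, not the authors' implementation: it assumes the learning phase produces a win-rate matrix `WIN`, where `WIN[i][j]` is how often our variant (seed) `i` beats opponent variant `j`. The matrix values are invented. BestArm is read as "pick the single row with the best average", and the Nash-portfolio as "play a mixed strategy that is an equilibrium of the matrix game". The equilibrium is approximated here by fictitious play, a swapped-in technique chosen for brevity; the paper may compute it differently.

```python
from typing import List

# Invented win rates: rows are our variants, columns are opponent variants.
WIN = [
    [0.9, 0.9, 0.1],
    [0.2, 0.6, 0.8],
    [0.7, 0.2, 0.6],
]


def best_arm(win: List[List[float]]) -> int:
    """BestArm reading: the row with the highest average win rate."""
    means = [sum(row) / len(row) for row in win]
    return max(range(len(win)), key=lambda i: means[i])


def nash_row_strategy(win: List[List[float]], iters: int = 5000) -> List[float]:
    """Approximate the row player's equilibrium mix by fictitious play."""
    n, m = len(win), len(win[0])
    row_counts = [0] * n
    col_counts = [0] * m
    row_counts[0] += 1  # arbitrary first moves
    col_counts[0] += 1
    for _ in range(iters):
        # Each side best-responds to the other's empirical play so far.
        row_payoff = [sum(win[i][j] * col_counts[j] for j in range(m)) for i in range(n)]
        col_payoff = [sum(win[i][j] * row_counts[i] for i in range(n)) for j in range(m)]
        row_counts[max(range(n), key=lambda i: row_payoff[i])] += 1
        col_counts[min(range(m), key=lambda j: col_payoff[j])] += 1
    total = sum(row_counts)
    return [c / total for c in row_counts]


def worst_case(win: List[List[float]], mix: List[float]) -> float:
    """Our win rate when the opponent learns the best reply to `mix`."""
    n, m = len(win), len(win[0])
    return min(sum(mix[i] * win[i][j] for i in range(n)) for j in range(m))


arm = best_arm(WIN)
pure = [1.0 if i == arm else 0.0 for i in range(len(WIN))]
mix = nash_row_strategy(WIN)
print("BestArm row:", arm, "worst case:", worst_case(WIN, pure))
print("Nash-style mix:", [round(p, 3) for p in mix], "worst case:", round(worst_case(WIN, mix), 3))
```

In this toy matrix the best-average row wins only 10% of games once the opponent finds its weak column, while the mixed strategy keeps a much higher worst-case win rate. That is the kind of robustness gap the abstract describes between BestArm and Nash-portfolio.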

Taxonomy

Topics: Advanced Bandit Algorithms Research · Artificial Intelligence in Games · Reinforcement Learning in Robotics