Open-ended Learning in Symmetric Zero-sum Games

David Balduzzi; Marta Garnelo; Yoram Bachrach; Wojciech M. Czarnecki; Julien Perolat; Max Jaderberg; Thore Graepel

arXiv:1901.08106 · cs.LG · May 14, 2019 · 46 cites

TL;DR

This paper introduces a geometric framework and a new algorithm, PSRO_rN, for open-ended learning in zero-sum games, especially nontransitive ones, enabling the creation of diverse and increasingly strong agent populations.

Contribution

The paper presents a novel geometric framework for agent objectives and a new algorithm, PSRO_rN, that enhances open-ended learning in nontransitive zero-sum games.

Findings

01. PSRO_rN outperforms existing algorithms in nontransitive resource allocation games.

02. The framework enables reasoning about population performance in complex game dynamics.

03. PSRO_rN produces more diverse and stronger agent populations.

Abstract

Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them 'winner' and 'loser'. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective: we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective…
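
The contrast the abstract draws between transitive and nontransitive games can be illustrated with a small sketch. This is not code from the paper: the function names, the payoff encoding (+1 win, -1 loss, 0 tie), and the rating-difference construction of a transitive game are all assumptions chosen for illustration. It treats a game as a function on pairs of agents, stored as an antisymmetric payoff matrix, and checks whether any single agent beats all the others.

```python
# Illustrative sketch only; all names and the example ratings are hypothetical.

def is_antisymmetric(payoff):
    """A symmetric zero-sum game satisfies payoff[i][j] == -payoff[j][i]."""
    n = len(payoff)
    return all(payoff[i][j] == -payoff[j][i] for i in range(n) for j in range(n))


def dominant_agents(payoff):
    """Indices of agents that beat every other agent in the population."""
    n = len(payoff)
    return [
        i for i in range(n)
        if all(payoff[i][j] > 0 for j in range(n) if j != i)
    ]


# Rock-paper-scissors: payoff of the row agent against the column agent.
RPS = [
    [0, -1, 1],   # rock: loses to paper, beats scissors
    [1, 0, -1],   # paper: beats rock, loses to scissors
    [-1, 1, 0],   # scissors: loses to rock, beats paper
]

# A transitive game built from assumed scalar ratings: payoff = rating difference.
ratings = [1.0, 2.0, 3.5]
TRANSITIVE = [[ri - rj for rj in ratings] for ri in ratings]

print(is_antisymmetric(RPS), dominant_agents(RPS))
print(is_antisymmetric(TRANSITIVE), dominant_agents(TRANSITIVE))
```

In the rating-based game the highest-rated agent beats everyone, so "increase in strength" has an obvious meaning. In rock-paper-scissors every agent loses to some other agent, so no single opponent defines progress, which is the difficulty the abstract points to.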

Taxonomy

Topics: Game Theory and Applications · Machine Learning and Algorithms · Advanced Bandit Algorithms Research