Finding and Certifying (Near-)Optimal Strategies in Black-Box Extensive-Form Games

Brian Hu Zhang; Tuomas Sandholm

arXiv:2009.07384 · cs.GT · March 18, 2021 · 1 citation

TL;DR

This paper develops methods for finding and certifying near-optimal strategies in large black-box extensive-form games using only game-play, without expanding the full game tree, and with guarantees on both exploitability and convergence.

Contribution

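It relaxes two assumptions of prior certificate work, namely that the black box can sample or expand arbitrary nodes of the game tree at any time and that a series of exact game solves can be conducted, so that certification and equilibrium-finding need only game-play and regret minimization.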

Findings

01

Achieves high-probability exploitability certificates with minimal game tree expansion.

02

Provides an equilibrium-finding algorithm with $\tilde O(1/\sqrt{T})$ convergence rate.

03

Demonstrates practical effectiveness in black-box settings through experiments.
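To make the convergence claim concrete, here is a minimal sketch of regret minimization in self-play. It is not the paper's black-box algorithm: it runs plain regret matching on a small, fully known zero-sum matrix game, and the payoff matrix `PAYOFF` and all function names are assumptions chosen for illustration. What it shows is the general mechanism behind a $1/\sqrt{T}$-type rate: the exploitability (Nash gap) of the average strategies equals the sum of both players' average regrets, which regret matching keeps below a bound that shrinks like $1/\sqrt{T}$.

```python
import math

# Hypothetical 3x3 zero-sum game (a biased rock-paper-scissors).
# PAYOFF[i][j] is the row player's payoff; the column player gets the negative.
PAYOFF = [
    [0.0, -1.0, 2.0],
    [1.0, 0.0, -1.0],
    [-1.0, 1.0, 0.0],
]


def regret_matching_strategy(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def exploitability(payoff, x, y):
    """Nash gap: best-response value against y minus best-response value against x."""
    n, m = len(payoff), len(payoff[0])
    row_best = max(sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n))
    col_best = min(sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m))
    return row_best - col_best


def self_play(payoff, iterations):
    """Run regret matching for both players and return their average strategies."""
    n, m = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n, [0.0] * m
    row_sum, col_sum = [0.0] * n, [0.0] * m
    for _ in range(iterations):
        x = regret_matching_strategy(row_regret)
        y = regret_matching_strategy(col_regret)
        # Expected utility of every pure action against the opponent's current mix.
        row_values = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        col_values = [-sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
        row_ev = sum(x[i] * row_values[i] for i in range(n))
        col_ev = sum(y[j] * col_values[j] for j in range(m))
        for i in range(n):
            row_regret[i] += row_values[i] - row_ev
            row_sum[i] += x[i]
        for j in range(m):
            col_regret[j] += col_values[j] - col_ev
            col_sum[j] += y[j]
    return [s / iterations for s in row_sum], [s / iterations for s in col_sum]


if __name__ == "__main__":
    for T in (100, 1000, 10000):
        avg_x, avg_y = self_play(PAYOFF, T)
        # Regret matching bound: each regret <= range * sqrt(actions * T), range = 3 here.
        bound = 2 * 3.0 * math.sqrt(3) / math.sqrt(T)
        print(T, exploitability(PAYOFF, avg_x, avg_y), bound)
```

The printed gap is itself a certificate for the average strategy pair in this toy setting: it can be computed directly from best responses, without trusting the convergence bound.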

Abstract

Often, for example in war games, strategy video games, and financial simulations, the game is given to us only as a black-box simulator in which we can play it. In these settings, since the game may have unknown nature action distributions (from which we can only obtain samples) and/or be too large to expand fully, it can be difficult to compute strategies with guarantees on exploitability. Recent work \cite{Zhang20:Small} resulted in a notion of certificate for extensive-form games that allows exploitability guarantees while not expanding the full game tree. However, that work assumed that the black box could sample or expand arbitrary nodes of the game tree at any time, and that a series of exact game solves (via, for example, linear programming) can be conducted to compute the certificate. Each of those two assumptions severely restricts the practical applicability of that…
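As a rough illustration of why sampled nature actions lead to guarantees that hold only with high probability, the sketch below computes a Hoeffding confidence interval for the mean payoff at a single chance node. This is an assumption about the general style of argument, not the paper's actual bound; the function name `hoeffding_interval`, the payoff range, and the sample values are hypothetical.

```python
import math


def hoeffding_interval(samples, low, high, delta):
    """Two-sided interval that contains the true mean with probability >= 1 - delta.

    Assumes the samples are i.i.d. draws that always lie in [low, high].
    """
    n = len(samples)
    mean = sum(samples) / n
    radius = (high - low) * math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    # Clip to the known payoff range, since the mean cannot lie outside it.
    return max(low, mean - radius), min(high, mean + radius)


if __name__ == "__main__":
    # Hypothetical payoffs observed after 200 simulator samples of one chance node.
    observed = [1.0] * 120 + [0.0] * 80
    print(hoeffding_interval(observed, low=0.0, high=1.0, delta=0.05))
```

The interval narrows like $1/\sqrt{n}$ in the number of samples, so tighter bounds at a node require more game-play through that node.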

Taxonomy

Topics: Artificial Intelligence in Games · Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research