Stochastic Regret Minimization in Extensive-Form Games

Gabriele Farina, Christian Kroer, and Tuomas Sandholm

arXiv:2002.08493 · cs.GT · February 21, 2020 · 5 cites

TL;DR

This paper introduces a flexible framework for stochastic regret minimization in extensive-form games. The framework yields stronger theoretical convergence guarantees for MCCFR and enables new algorithms, some of which outperform MCCFR in experiments.

Contribution

It develops a general framework that couples any regret-minimization algorithm with any gradient estimator, generalizing MCCFR and improving both theoretical guarantees and empirical performance.

Findings

01. New stochastic methods outperform MCCFR in experiments

02. Framework provides stronger convergence guarantees

03. Analysis simplifies understanding of MCCFR's properties

Abstract

Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full tree traversals. It works by using gradient estimates that can be computed via sampling. However, stochastic methods for sequential games have not been investigated extensively beyond MCCFR. In this paper, we develop a new framework for constructing stochastic regret minimization methods. This framework allows us to use any regret-minimization algorithm, coupled with any gradient estimator. The MCCFR algorithm can be analyzed as a special case of our framework, and this analysis leads to significantly stronger theoretical guarantees on convergence, while simultaneously yielding a simplified proof. Our framework allows us to instantiate several new stochastic methods for solving sequential games. We show extensive experiments on three games, where some…
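The coupling described in the abstract (a regret minimizer fed by a sampled gradient estimate) can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a small normal-form game (rock-paper-scissors) rather than an extensive-form game, uses regret matching as the regret minimizer, and uses a single sampled opponent action as an unbiased estimate of each player's utility vector. All function names here are hypothetical and chosen only for illustration.

```python
import random

# Assumed toy game: rock-paper-scissors payoffs for the row player (zero-sum).
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]


def regret_matching(cum_regret):
    """Map cumulative regrets to a strategy (uniform if no regret is positive)."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(cum_regret)] * len(cum_regret)
    return [p / total for p in positive]


def sampled_utilities(payoff, opponent_strategy, rng, row_player):
    """Unbiased utility-vector estimate built from one sampled opponent action."""
    k = rng.choices(range(len(opponent_strategy)), weights=opponent_strategy)[0]
    if row_player:
        return [row[k] for row in payoff]   # column k of the payoff matrix
    return [-v for v in payoff[k]]          # negated row k (zero-sum game)


def update(cum_regret, strategy, utilities):
    """Add this iteration's (estimated) instantaneous regrets."""
    expected = sum(s * u for s, u in zip(strategy, utilities))
    return [r + u - expected for r, u in zip(cum_regret, utilities)]


def solve(payoff, iterations, seed=0):
    """Self-play with sampled utilities; returns the average strategies."""
    rng = random.Random(seed)
    n_rows, n_cols = len(payoff), len(payoff[0])
    regret_x, regret_y = [0.0] * n_rows, [0.0] * n_cols
    sum_x, sum_y = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x = regret_matching(regret_x)
        y = regret_matching(regret_y)
        regret_x = update(regret_x, x, sampled_utilities(payoff, y, rng, True))
        regret_y = update(regret_y, y, sampled_utilities(payoff, x, rng, False))
        sum_x = [a + b for a, b in zip(sum_x, x)]
        sum_y = [a + b for a, b in zip(sum_y, y)]
    return [v / iterations for v in sum_x], [v / iterations for v in sum_y]


def saddle_point_gap(payoff, x, y):
    """Best-response value for the row player minus that for the column player."""
    best_row = max(sum(p * q for p, q in zip(row, y)) for row in payoff)
    best_col = min(
        sum(payoff[i][j] * x[i] for i in range(len(x))) for j in range(len(y))
    )
    return best_row - best_col


if __name__ == "__main__":
    x, y = solve(RPS, 20000, seed=1)
    print("average strategies:", x, y)
    print("saddle-point gap:", saddle_point_gap(RPS, x, y))
```

Swapping `regret_matching` for another regret minimizer, or `sampled_utilities` for another unbiased estimator, leaves the loop in `solve` unchanged, which is the modularity the abstract points to.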

Videos

Stochastic Regret Minimization in Extensive-Form Games · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Artificial Intelligence in Games · Reinforcement Learning in Robotics