Generalized Bandit Regret Minimizer Framework in Imperfect Information Extensive-Form Game

Linjian Meng; Yang Gao

arXiv:2203.05920·cs.LG·August 21, 2023

TL;DR

This paper introduces a generalized framework for regret minimization in imperfect information games with bandit feedback, enabling more efficient learning of approximate Nash equilibria.

Contribution

The paper proposes a modular theoretical framework for bandit regret minimization, shows that existing methods can be analyzed as special cases of it, and introduces a novel, more efficient algorithm, SIX-OMD.

Findings

01. SIX-OMD significantly improves convergence rates over previous methods.

02. The framework unifies analysis of various bandit regret minimization algorithms.

03. SIX-OMD is computationally efficient, requiring only current and average strategy updates.

Abstract

Regret minimization methods are a powerful tool for learning an approximate Nash equilibrium (NE) in two-player zero-sum imperfect information extensive-form games (IIEGs). We consider the problem in the interactive bandit-feedback setting, where the dynamics of the IIEG are unknown. In general, only the interactive trajectory and the value of the reached terminal node, $v(z^{t})$, are revealed. To learn an NE, the regret minimizer must estimate the full-feedback loss gradient $\ell^{t}$ from $v(z^{t})$ and minimize the regret. In this paper, we propose a generalized framework for this learning setting. It provides a theoretical framework for the design and modular analysis of bandit regret minimization methods. We demonstrate that the most recent bandit regret minimization methods can be analyzed as particular cases of our framework. Following this framework, we describe a novel method…
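To make the estimation step in the abstract concrete, here is a minimal sketch of the general idea at a single decision point: only the loss of the sampled action is observed, an importance-weighted estimate stands in for the full loss vector, and a regret minimizer updates on that estimate. This is not SIX-OMD or the paper's framework, which operate over the whole game tree. The function names, the multiplicative-weights update, the bias term `gamma`, and the toy loss values are all assumptions chosen for illustration.

```python
import math
import random


def estimate_loss(policy, action, observed_loss, gamma=0.0):
    """Importance-weighted estimate of the full loss vector from one sample.

    Only the sampled action's loss is observed. Dividing it by the
    probability of sampling that action (plus an optional bias term gamma)
    yields a full-length estimate; with gamma = 0 it is unbiased.
    """
    estimate = [0.0] * len(policy)
    estimate[action] = observed_loss / (policy[action] + gamma)
    return estimate


def hedge_update(policy, loss_estimate, eta):
    """One multiplicative-weights (entropic mirror descent) step on the simplex."""
    weights = [p * math.exp(-eta * l) for p, l in zip(policy, loss_estimate)]
    total = sum(weights)
    return [w / total for w in weights]


def run_bandit(true_losses, rounds, eta, gamma, seed=0):
    """Play a toy bandit with fixed (hypothetical) losses in [0, 1].

    Returns the final policy and the running average of the policies played.
    """
    rng = random.Random(seed)
    n = len(true_losses)
    policy = [1.0 / n] * n
    average = [0.0] * n
    for t in range(1, rounds + 1):
        # Sample an action from the current policy; observe only its loss.
        action = rng.choices(range(n), weights=policy)[0]
        observed = true_losses[action]
        estimate = estimate_loss(policy, action, observed, gamma)
        # Track the average of the policies used so far, then update.
        average = [a + (p - a) / t for a, p in zip(average, policy)]
        policy = hedge_update(policy, estimate, eta)
    return policy, average


if __name__ == "__main__":
    final_policy, average_policy = run_bandit(
        true_losses=[0.9, 0.1, 0.8], rounds=2000, eta=0.05, gamma=0.01
    )
    print("final policy:  ", [round(p, 3) for p in final_policy])
    print("average policy:", [round(p, 3) for p in average_policy])
```

A positive `gamma` trades a small bias for lower variance in the estimate, which keeps the update stable when an action's sampling probability becomes very small.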

Taxonomy

Topics: Advanced Bandit Algorithms Research · Game Theory and Applications