Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent

Yu Bai; Chi Jin; Song Mei; Ziang Song; Tiancheng Yu

arXiv:2205.15294 · cs.LG · October 28, 2022 · 1 cites

TL;DR

This paper introduces a polynomial-time online mirror descent approach for learning equilibria in extensive-form games, achieving optimal regret bounds and connecting game-theoretic algorithms with mirror descent techniques.

Contribution

It establishes a novel equivalence between Phi-Hedge algorithms and online mirror descent for EFGs, enabling efficient equilibrium learning with improved regret guarantees.

Findings

01 Polynomial-time algorithms for EFCE with optimal regret.

02 Equivalence between Phi-Hedge and OMD in EFGs.

03 A bandit-feedback regret bound that matches the lower bound.

Abstract

A conceptually appealing approach for learning Extensive-Form Games (EFGs) is to convert them to Normal-Form Games (NFGs). This approach enables us to directly translate state-of-the-art techniques and analyses in NFGs to learning EFGs, but typically suffers from computational intractability due to the exponential blow-up of the game size introduced by the conversion. In this paper, we address this problem in natural and important setups for the $\Phi$-Hedge algorithm, a generic algorithm capable of learning a large class of equilibria for NFGs. We show that $\Phi$-Hedge can be directly used to learn Nash Equilibria (zero-sum settings), Normal-Form Coarse Correlated Equilibria (NFCCE), and Extensive-Form Correlated Equilibria (EFCE) in EFGs. We prove that, in those settings, the $\Phi$-Hedge algorithms are equivalent to standard Online Mirror Descent (OMD) algorithms for…
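To give a feel for the kind of equivalence the abstract refers to, the sketch below checks the simplest known instance: on a probability simplex (the normal-form case), plain Hedge coincides with OMD under the negative-entropy regularizer. This is only an illustration of that textbook special case, not the paper's EFG construction; the function names, the loss values, and the learning rate `eta` are all assumptions made for the example.

```python
import math


def hedge_from_cumulative_losses(loss_history, eta):
    """Hedge: play each action with probability proportional to
    exp(-eta * cumulative loss of that action)."""
    n = len(loss_history[0])
    cum = [0.0] * n
    for losses in loss_history:
        cum = [c + l for c, l in zip(cum, losses)]
    m = min(cum)  # shift by the minimum for numerical stability
    weights = [math.exp(-eta * (c - m)) for c in cum]
    z = sum(weights)
    return [w / z for w in weights]


def omd_entropy_step(x, losses, eta):
    """One OMD step on the simplex with the negative-entropy regularizer.
    The minimizer has the closed form x_new(a) proportional to
    x(a) * exp(-eta * loss(a))."""
    unnorm = [xi * math.exp(-eta * l) for xi, l in zip(x, losses)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def omd_entropy(loss_history, eta):
    """Run OMD from the uniform distribution over all rounds."""
    n = len(loss_history[0])
    x = [1.0 / n] * n
    for losses in loss_history:
        x = omd_entropy_step(x, losses, eta)
    return x


# Hypothetical loss vectors for three actions over three rounds.
loss_history = [[0.2, 0.9, 0.5], [0.7, 0.1, 0.4], [0.3, 0.3, 0.8]]
eta = 0.5  # assumed learning rate

p_hedge = hedge_from_cumulative_losses(loss_history, eta)
p_omd = omd_entropy(loss_history, eta)
print(p_hedge)
print(p_omd)
```

Both functions return the same distribution up to floating-point error, because the multiplicative OMD updates telescope into a single exponential of the cumulative loss. Running Hedge naively on the normal-form conversion of an EFG would require one weight per pure strategy, which is the exponential blow-up the abstract mentions.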

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Auction Theory and Applications