Adapting to game trees in zero-sum imperfect information games
Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko

TL;DR
This paper investigates learning near-optimal strategies in zero-sum imperfect information games through self-play, establishing lower bounds and proposing two algorithms that adapt to game structure and observations.
Contribution
It provides a problem-independent lower bound on sample complexity and introduces two FTRL algorithms, one requiring prior game structure knowledge and the other adapting online.
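For readers unfamiliar with FTRL, the general principle can be sketched on a much simpler problem than the one studied in the paper. The example below is an illustration only, not Balanced FTRL or Adaptive FTRL: it assumes a small zero-sum matrix game with full-information feedback (no information sets, no trajectory feedback) and a negative-entropy regularizer, for which the FTRL update has a closed form. The payoff matrix, step size, and function names are all hypothetical.

```python
import math

def ftrl_entropy(cum_loss, eta):
    """FTRL step on the simplex with a negative-entropy regularizer.

    Solves argmin_p <p, cum_loss> + (1/eta) * sum_i p_i log p_i,
    whose closed form is the exponential-weights distribution.
    """
    base = min(cum_loss)  # shift for numerical stability
    weights = [math.exp(-eta * (loss - base)) for loss in cum_loss]
    total = sum(weights)
    return [w / total for w in weights]

def self_play(payoff, rounds, eta):
    """Both players run FTRL; payoff[i][j] is the row player's loss
    and the column player's gain. Returns the average strategies."""
    n, m = len(payoff), len(payoff[0])
    loss_x, loss_y = [0.0] * n, [0.0] * m
    avg_x, avg_y = [0.0] * n, [0.0] * m
    for _ in range(rounds):
        x = ftrl_entropy(loss_x, eta)
        y = ftrl_entropy(loss_y, eta)
        for i in range(n):
            loss_x[i] += sum(payoff[i][j] * y[j] for j in range(m))
            avg_x[i] += x[i] / rounds
        for j in range(m):
            loss_y[j] -= sum(payoff[i][j] * x[i] for i in range(n))
            avg_y[j] += y[j] / rounds
    return avg_x, avg_y

def exploitability(payoff, x, y):
    """Gap between the best responses to x and to y (zero at equilibrium)."""
    n, m = len(payoff), len(payoff[0])
    best_vs_x = max(sum(x[i] * payoff[i][j] for i in range(n)) for j in range(m))
    best_vs_y = min(sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n))
    return best_vs_x - best_vs_y

# Hypothetical 2x2 game whose equilibrium is (0.4, 0.6) for both players.
PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]
ROUNDS = 5000
ETA = math.sqrt(8 * math.log(2) / (9 * ROUNDS))  # tuned for losses of width 3
avg_x, avg_y = self_play(PAYOFF, ROUNDS, ETA)
gap = exploitability(PAYOFF, avg_x, avg_y)
```

The exploitability of the averaged strategies is bounded by the sum of the two players' regrets divided by the number of rounds, which is why self-play with a no-regret method such as FTRL yields approximately optimal strategies in this toy setting.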
Findings
Lower bound of order $H(A_{\mathcal{X}}+B_{\mathcal{Y}})/\epsilon^2$ on the number of realizations needed
Balanced FTRL matches the lower bound but needs game structure knowledge
Adaptive FTRL achieves near-optimal sample complexity without prior structure knowledge
Abstract
Imperfect information games (IIG) are games in which each player only partially observes the current game state. We study how to learn $\epsilon$-optimal strategies in a zero-sum IIG through self-play with trajectory feedback. We give a problem-independent lower bound $\widetilde{\mathcal{O}}(H(A_{\mathcal{X}}+B_{\mathcal{Y}})/\epsilon^2)$ on the required number of realizations to learn these strategies with high probability, where $H$ is the length of the game, and $A_{\mathcal{X}}$ and $B_{\mathcal{Y}}$ are the total number of actions for the two players. We also propose two Follow the Regularized Leader (FTRL) algorithms for this setting: Balanced FTRL, which matches this lower bound but requires knowledge of the information set structure beforehand to define the regularization; and Adaptive FTRL, which needs …
Taxonomy
Topics: Artificial Intelligence in Games · Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics
