Finite-Sample Guarantees for Learning Dynamics in Zero-Sum Polymatrix Games

Fathima Zarin Faizal; Asuman Ozdaglar; Martin J. Wainwright

arXiv:2407.20128 · math.OC · August 13, 2025

Open Access

TL;DR

This paper establishes finite-sample convergence guarantees for learning dynamics in zero-sum polymatrix games under two information scenarios, using a two-timescale approach combining smoothed best-response and TD-learning.

Contribution

The paper introduces a novel two-timescale learning dynamic with finite-sample guarantees for zero-sum polymatrix games, most notably in the minimal information setting.

Findings

01. Polynomial-time convergence to ε-Nash equilibrium

02. Finite-sample guarantees established for both information settings

03. Effective learning dynamics without additional exploration

Abstract

We study best-response type learning dynamics for zero-sum polymatrix games under two information settings. The two settings are distinguished by the type of information that each player has about the game and their opponents' strategies. The first setting is the full information case, in which each player knows their own and their opponents' payoff matrices and observes everyone's mixed strategies. The second setting is the minimal information case, where players do not observe their opponents' strategies and are not aware of any payoff matrices (instead they only observe their realized payoffs). For this setting, also known as the radically uncoupled case in the learning in games literature, we study two-timescale learning dynamics that combine smoothed best-response type updates for strategy estimates with a TD-learning update to estimate a local payoff function. For these dynamics,…
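To make the minimal information setting concrete, the following is a rough sketch of what a two-timescale update of this kind might look like. It is not the paper's algorithm: it assumes a two-player zero-sum matrix game (the simplest polymatrix case), uses softmax as the smoothed best response, and the names `run`, `alpha`, `beta`, and `tau` as well as the choice of which update runs on the faster timescale are illustrative assumptions. Each player sees only its own action and realized payoff.

```python
import math
import random

def softmax(q, tau):
    """Smoothed best response: softmax of payoff estimates at temperature tau."""
    m = max(q)  # subtract the max for numerical stability
    w = [math.exp((x - m) / tau) for x in q]
    s = sum(w)
    return [x / s for x in w]

def sample(pi, rng):
    """Draw an action index from the mixed strategy pi."""
    u, acc = rng.random(), 0.0
    for a, p in enumerate(pi):
        acc += p
        if u < acc:
            return a
    return len(pi) - 1  # guard against floating-point round-off

def run(A, steps=20000, alpha=0.05, beta=0.005, tau=0.1, seed=0):
    """Hypothetical two-timescale dynamics on a two-player zero-sum game.

    Player 0 receives A[a0][a1]; player 1 receives -A[a0][a1].
    Step sizes and temperature are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    n = [len(A), len(A[0])]
    pi = [[1.0 / n[i]] * n[i] for i in range(2)]  # strategy estimates
    q = [[0.0] * n[i] for i in range(2)]          # local payoff estimates
    for _ in range(steps):
        a = [sample(pi[i], rng) for i in range(2)]
        r = [A[a[0]][a[1]], -A[a[0]][a[1]]]       # realized payoffs only
        for i in range(2):
            # TD-style update of the payoff estimate for the played action
            # (assumed here to run on the faster timescale, alpha > beta).
            q[i][a[i]] += alpha * (r[i] - q[i][a[i]])
            # Slower step of the strategy toward the smoothed best response.
            br = softmax(q[i], tau)
            pi[i] = [p + beta * (b - p) for p, b in zip(pi[i], br)]
    return pi, q

if __name__ == "__main__":
    matching_pennies = [[1, -1], [-1, 1]]
    strategies, estimates = run(matching_pennies)
    print(strategies)
```

Because each strategy update is a convex combination of two probability vectors, the strategy estimates remain valid mixed strategies throughout. Neither player ever reads the matrix `A` directly or the opponent's strategy; `A` is used only to generate the realized payoffs.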

Taxonomy

Topics: Advanced Bandit Algorithms Research

Methods: Attentive Walk-Aggregating Graph Neural Network