Near-Optimal No-Regret Learning for Correlated Equilibria in Multi-Player General-Sum Games

Ioannis Anagnostides; Constantinos Daskalakis; Gabriele Farina; Maxwell Fishelson; Noah Golowich; Tuomas Sandholm

arXiv:2111.06008·cs.LG·January 26, 2023

TL;DR

This paper extends recent no-regret learning results in multi-player games to internal and swap regret, achieving near-optimal convergence rates to correlated equilibria using novel techniques for analyzing fixed point operations.

Contribution

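The paper introduces new methods for establishing higher-order smoothness in learning dynamics that involve fixed point operations. These methods improve the convergence rate to correlated equilibria and yield an analysis of classic algorithms such as that of Blum and Mansour (BM).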

Findings

01

Achieves $\tilde{O}(T^{-1})$ convergence to correlated equilibrium.

02

Establishes $O(\textrm{polylog}(T))$ no-swap-regret bound for Blum and Mansour's algorithm.

03

Develops techniques for higher-order smoothness in fixed point based learning dynamics.
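
To make the link between the swap regret bound and the convergence rate concrete, the sketch below computes the swap regret of a play history in the standard way: for each action, take the best fixed replacement in hindsight, weighted by how much probability the player placed on that action. If every player's swap regret after $T$ rounds is at most $R$, the empirical distribution of play is an $(R/T)$-approximate correlated equilibrium, which is how a polylogarithmic regret bound translates into a $\tilde{O}(T^{-1})$ rate. The function name `swap_regret` and the list-of-lists input layout are assumptions made for illustration and do not come from the paper.

```python
def swap_regret(strategies, utilities):
    """Swap regret of one player over T rounds (illustrative helper).

    strategies[t][a] is the probability placed on action a in round t.
    utilities[t][a] is the utility action a would have earned in round t.
    """
    n = len(strategies[0])
    total = 0.0
    for a in range(n):
        # Gain from rerouting all mass on action a to a fixed action b.
        gains = [
            sum(x[a] * (u[b] - u[a]) for x, u in zip(strategies, utilities))
            for b in range(n)
        ]
        # b == a always gives a gain of 0, so each term is non-negative.
        total += max(gains)
    return total


# A player who always plays action 0 while action 1 always pays 1 more.
history_x = [[1.0, 0.0], [1.0, 0.0]]
history_u = [[0.0, 1.0], [0.0, 1.0]]
regret = swap_regret(history_x, history_u)
average_regret = regret / len(history_x)
```

Dividing by the number of rounds, as in `average_regret`, gives the equilibrium approximation quality for that player.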

Abstract

Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS'21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is $O(\textrm{polylog}(T))$ after $T$ repetitions of the game. We extend their result from external regret to internal regret and swap regret, thereby establishing uncoupled learning dynamics that converge to an approximate correlated equilibrium at the rate of $\tilde{O}(T^{-1})$. This substantially improves over the prior best rate of convergence for correlated equilibria of $O(T^{-3/4})$ due to Chen and Peng (NeurIPS'20), and it is optimal -- within the no-regret framework -- up to polylogarithmic factors in $T$. To obtain these results, we develop new techniques for establishing higher-order smoothness for learning dynamics involving fixed point…
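
For readers unfamiliar with OMWU, here is a minimal sketch of a single update in its commonly stated form with utilities: the new weight of each action is proportional to its old weight times $\exp(\eta(2u^{t} - u^{t-1}))$, so the most recent utility vector is counted twice as a prediction of the next one. The function name `omwu_step` and the step size `eta` are assumptions chosen for illustration; this is not the paper's pseudocode, and it shows only the external-regret building block, not the internal or swap regret dynamics.

```python
import math


def omwu_step(x, u_now, u_prev, eta):
    """One Optimistic Multiplicative Weights Update (illustrative sketch).

    x      : current mixed strategy (positive entries summing to 1)
    u_now  : utility vector observed in the current round
    u_prev : utility vector observed in the previous round
    eta    : step size (hypothetical value; tuning is not covered here)
    """
    # Optimistic score: the latest utility counted twice, minus the previous one.
    scores = [eta * (2.0 * un - up) for un, up in zip(u_now, u_prev)]
    # Subtract the max score before exponentiating, for numerical stability.
    shift = max(scores)
    weights = [p * math.exp(s - shift) for p, s in zip(x, scores)]
    total = sum(weights)
    return [w / total for w in weights]


# Starting from uniform play, action 0 just earned utility 1 and action 1 earned 0.
next_x = omwu_step([0.5, 0.5], [1.0, 0.0], [0.0, 0.0], eta=0.1)
```

Setting `u_prev` equal to `u_now` recovers a plain multiplicative weights step, which makes the role of the optimistic correction easy to see.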

Taxonomy

Topics: Advanced Bandit Algorithms Research · Markov Chains and Monte Carlo Methods · Bayesian Modeling and Causal Inference