Uncoupled Learning Dynamics with $O(\log T)$ Swap Regret in Multiplayer Games

Ioannis Anagnostides; Gabriele Farina; Christian Kroer; Chung-Wei Lee; Haipeng Luo; Tuomas Sandholm

arXiv:2204.11417·cs.GT·October 7, 2022·1 cites

TL;DR

This paper introduces uncoupled learning dynamics for general-sum multiplayer games under which each player's swap regret is bounded by O(log T), improving on the prior best bound of O(log^4 T), while retaining the optimal O(√T) swap regret in the adversarial setting.

Contribution

The paper presents novel uncoupled learning dynamics with a time-invariant learning rate whose second-order path lengths are bounded by O(log T) when all players follow them, which in turn yields improved swap regret bounds in multiplayer games.

Findings

1. Achieves O(log T) swap regret in multiplayer games
2. Maintains O(√T) swap regret in adversarial regimes
3. Uses a novel combination of optimistic regularization and self-concordant barriers

Abstract

In this paper we establish efficient and \emph{uncoupled} learning dynamics so that, when employed by all players in a general-sum multiplayer game, the \emph{swap regret} of each player after $T$ repetitions of the game is bounded by $O(\log T)$, improving over the prior best bounds of $O(\log^{4}(T))$. At the same time, we guarantee optimal $O(\sqrt{T})$ swap regret in the adversarial regime as well. To obtain these results, our primary contribution is to show that when all players follow our dynamics with a \emph{time-invariant} learning rate, the \emph{second-order path lengths} of the dynamics up to time $T$ are bounded by $O(\log T)$, a fundamental property which could have further implications beyond near-optimally bounding the (swap) regret. Our proposed learning dynamics combine in a novel way \emph{optimistic} regularized learning with the use of \emph{self-concordant…
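
To make the quantity bounded in the abstract concrete, the following is a minimal sketch of how one player's swap regret (and, for comparison, external regret) could be measured from a play history. It assumes a finite action set, mixed strategies given as probability vectors, and a full utility vector observed each round. The function names `external_regret` and `swap_regret` are illustrative and do not come from the paper, and the sketch only evaluates regret; it does not implement the paper's learning dynamics.

```python
from typing import Sequence

Vector = Sequence[float]


def external_regret(strategies: Sequence[Vector], utilities: Sequence[Vector]) -> float:
    """Best fixed action in hindsight minus the utility actually collected."""
    n = len(utilities[0])
    # Expected utility collected by the mixed strategies that were played.
    realized = sum(
        sum(x[a] * u[a] for a in range(n)) for x, u in zip(strategies, utilities)
    )
    # Cumulative utility of the best single action in hindsight.
    best_fixed = max(sum(u[b] for u in utilities) for b in range(n))
    return best_fixed - realized


def swap_regret(strategies: Sequence[Vector], utilities: Sequence[Vector]) -> float:
    """Gain from the best swap function phi: actions -> actions in hindsight.

    The maximum over swap functions decomposes per action: for each action a,
    pick the single replacement b that helps most on the mass placed on a.
    """
    n = len(utilities[0])
    total = 0.0
    for a in range(n):
        gains = [
            sum(x[a] * (u[b] - u[a]) for x, u in zip(strategies, utilities))
            for b in range(n)
        ]
        # b = a always gives gain 0, so each term is nonnegative.
        total += max(gains)
    return total


if __name__ == "__main__":
    # Hypothetical history: the player always picks the action that turns out worse.
    xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    us = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    print("external regret:", external_regret(xs, us))
    print("swap regret:", swap_regret(xs, us))
```

Because the identity swap is always available and a constant swap function recovers any fixed action, swap regret is never smaller than external regret, which is why a bound on swap regret is the stronger guarantee.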

Videos

Uncoupled Learning Dynamics with $O(\log T)$ Swap Regret in Multiplayer Games · slideslive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research · Machine Learning and Algorithms