Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

Yang Cai; Gabriele Farina; Julien Grand-Clément; Christian Kroer; Chung-Wei Lee; Haipeng Luo; Weiqiang Zheng

arXiv:2406.10631·cs.GT·January 22, 2025

TL;DR

This paper investigates the last-iterate convergence of algorithms like OMWU in two-player zero-sum games, revealing that many such algorithms suffer from slow convergence due to their inability to forget past information quickly.

Contribution

The paper proves that a broad class of algorithms, including OMWU, inherently has slow last-iterate convergence in certain games, a fundamental limitation of algorithms in this class.

Findings

01

OMWU and similar algorithms can have a constant duality gap even after many rounds

02

Slow convergence is inherent for algorithms that do not forget past information quickly

03

The analysis applies to a broad class of optimistic algorithms

Abstract

Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games, both in theory and practice. Particularly popular algorithms include optimistic multiplicative weights update (OMWU) and optimistic gradient-descent-ascent (OGDA). While both algorithms enjoy $O(1/T)$ ergodic convergence to Nash equilibrium in two-player zero-sum games, OMWU offers several advantages including logarithmic dependence on the size of the payoff matrix and $O(1/T)$ convergence to coarse correlated equilibria even in general-sum games. However, in terms of last-iterate convergence in two-player zero-sum games, an increasingly popular topic in this area, OGDA guarantees that the duality gap shrinks at a rate of $O(1/T)$, while the best existing last-iterate convergence for OMWU depends on some game-dependent constant that could be arbitrarily large. This…
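As a rough illustration of the quantities the abstract refers to, the sketch below runs OMWU in self-play on a small zero-sum game and records the duality gap of the last iterate after each round. The row player is taken to minimize $x^\top A y$ and the column player to maximize it. The 2×2 payoff matrix, the step size `eta`, the uniform initialization, and the use of the first gradient as the initial prediction are all assumptions made for illustration. None of them come from the paper, and this benign game is not one of the paper's hard instances.

```python
import math


def matvec(A, v):
    """Return the matrix-vector product A v for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def normalize(w):
    s = sum(w)
    return [v / s for v in w]


def duality_gap(A, x, y):
    """max_j (A^T x)_j - min_i (A y)_i; zero exactly at a Nash equilibrium."""
    return max(matvec(transpose(A), x)) - min(matvec(A, y))


def omwu(A, eta=0.1, T=1000):
    """Self-play with OMWU; returns the last iterates and per-round gaps.

    Assumed update (one common single-sequence form of OMWU):
        x_{t+1}[i] ∝ x_t[i] * exp(-eta * (2 * g_t[i] - g_{t-1}[i]))
    with g_t = A y_t for the minimizer, and the sign flipped for the maximizer.
    """
    n, m = len(A), len(A[0])
    At = transpose(A)
    x = [1.0 / n] * n  # assumed uniform start
    y = [1.0 / m] * m
    gx_prev = matvec(A, y)   # assumed initial prediction: the first gradient
    gy_prev = matvec(At, x)
    gaps = []
    for _ in range(T):
        gx = matvec(A, y)    # loss vector of the row (min) player
        gy = matvec(At, x)   # payoff vector of the column (max) player
        x = normalize([xi * math.exp(-eta * (2 * g - gp))
                       for xi, g, gp in zip(x, gx, gx_prev)])
        y = normalize([yj * math.exp(eta * (2 * g - gp))
                       for yj, g, gp in zip(y, gy, gy_prev)])
        gx_prev, gy_prev = gx, gy
        gaps.append(duality_gap(A, x, y))
    return x, y, gaps


if __name__ == "__main__":
    A = [[2.0, -1.0], [-1.0, 1.0]]  # hypothetical payoff matrix
    x, y, gaps = omwu(A, eta=0.1, T=2000)
    print("last-iterate duality gap:", gaps[-1])
```

Plotting `gaps` against the round index shows how the last iterate behaves, as opposed to the time-averaged iterates covered by the ergodic $O(1/T)$ guarantee.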

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques