Forward Looking Best-Response Multiplicative Weights Update Methods for Bilinear Zero-sum Games

Michail Fasoulakis; Evangelos Markakis; Yannis Pantazis; Constantinos Varsos

arXiv:2106.03579·cs.GT·March 9, 2022

Open Access

TL;DR

This paper introduces a novel extra gradient learning algorithm for bilinear zero-sum games that guarantees last-iterate convergence to Nash equilibria, outperforming existing methods in convergence speed and practical efficiency.

Contribution

The paper proposes a new variant of optimistic mirror descent with large intermediate steps, ensuring convergence to Nash equilibria in bilinear zero-sum games, with theoretical guarantees and practical improvements.

Findings

01

Guarantees last-iterate convergence to Nash equilibrium.

02

Achieves faster convergence compared to existing methods.

03

Experimental results show significant practical acceleration.

Abstract

Our work focuses on extra gradient learning algorithms for finding Nash equilibria in bilinear zero-sum games. The proposed method, which can be formally considered as a variant of Optimistic Mirror Descent (Mertikopoulos et al., ICLR 2019), uses a large learning rate for the intermediate gradient step, which essentially leads to computing (approximate) best response strategies against the profile of the previous iteration. Although counter-intuitive at first sight due to the irrationally large, for an iterative algorithm, intermediate learning step, we prove that the method guarantees last-iterate convergence to an equilibrium. In particular, we show that the algorithm first reaches an $\eta^{1/\rho}$-approximate Nash equilibrium, with $\rho > 1$, by decreasing the Kullback-Leibler divergence of each iterate by at least $\Omega(\eta^{1 + \frac{1}{\rho}})$, for sufficiently small…
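The two-step update described in the abstract can be illustrated with a minimal sketch. It is not the authors' reference implementation. It assumes a payoff matrix `A` where the row player maximizes `x^T A y` and the column player minimizes it, an intermediate multiplicative-weights step with a large rate `xi` (which approximates a best response to the previous profile), and a final step with a small rate `eta` taken from the previous iterate using the payoffs against the intermediate strategies. The names `mwu_step`, `flbr_mwu`, `duality_gap`, `xi`, and `eta` are hypothetical and chosen only for illustration.

```python
import math


def mwu_step(p, gains, rate):
    """One multiplicative-weights step: p_i * exp(rate * gains_i), renormalized."""
    scaled = [rate * g for g in gains]
    shift = max(scaled)  # subtract the max exponent for numerical stability
    w = [pi * math.exp(s - shift) for pi, s in zip(p, scaled)]
    total = sum(w)
    return [wi / total for wi in w]


def row_payoffs(A, y):
    """Expected payoff of each pure row strategy against mixed strategy y."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]


def col_payoffs(A, x):
    """Expected payoff conceded by each pure column strategy against x."""
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]


def duality_gap(A, x, y):
    """Best-response gain of the row player plus that of the column player."""
    return max(row_payoffs(A, y)) - min(col_payoffs(A, x))


def flbr_mwu(A, x, y, eta, xi, steps):
    """Sketch of a forward-looking best-response MWU loop (assumed form)."""
    for _ in range(steps):
        # Intermediate step with large rate xi: close to a best response
        # against the previous profile (x, y).
        x_hat = mwu_step(x, row_payoffs(A, y), xi)
        y_hat = mwu_step(y, [-c for c in col_payoffs(A, x)], xi)
        # Final step with small rate eta, starting from the previous iterate
        # and using the payoffs against the intermediate strategies.
        x_next = mwu_step(x, row_payoffs(A, y_hat), eta)
        y_next = mwu_step(y, [-c for c in col_payoffs(A, x_hat)], eta)
        x, y = x_next, y_next
    return x, y
```

For example, on matching pennies (`A = [[1, -1], [-1, 1]]`), calling `flbr_mwu(A, [0.8, 0.2], [0.3, 0.7], eta=0.01, xi=50.0, steps=5000)` and comparing `duality_gap` before and after gives a quick check that the last iterate, rather than an average of iterates, approaches the uniform equilibrium. The specific values of `eta` and `xi` here are illustrative assumptions, not the step sizes analyzed in the paper.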

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms