Fast and Furious Learning in Zero-Sum Games: Vanishing Regret with Non-Vanishing Step Sizes

James P. Bailey; Georgios Piliouras

arXiv:1905.04532·cs.GT·May 14, 2019·6 cites

TL;DR

This paper shows that in two-agent, two-strategy zero-sum games, gradient descent with a fixed step size achieves vanishing time-average regret, and that the time-average of the strategies converges to Nash equilibrium, challenging the belief that diminishing step sizes are necessary.

Contribution

The paper introduces the notion of 'fast and furious' learning: gradient descent with a fixed step size attains optimal regret bounds in two-agent, two-strategy zero-sum games, without knowing the time horizon in advance.

Findings

01

Achieves $\Theta(\sqrt{T})$ regret with fixed step sizes

02

Convergence of time-average strategies to Nash equilibrium

03

Applies to two-agent, two-strategy zero-sum games such as Matching Pennies

Abstract

We show for the first time, to our knowledge, that it is possible to reconcile in online learning in zero-sum games two seemingly contradictory objectives: vanishing time-average regret and non-vanishing step sizes. This phenomenon, that we coin "fast and furious" learning in games, sets a new benchmark about what is possible both in max-min optimization as well as in multi-agent systems. Our analysis does not depend on introducing a carefully tailored dynamic. Instead we focus on the most well studied online dynamic, gradient descent. Similarly, we focus on the simplest textbook class of games, two-agent two-strategy zero-sum games, such as Matching Pennies. Even for this simplest of benchmarks the best known bound for total regret, prior to our work, was the trivial one of $O(T)$, which is immediately applicable even to a non-learning agent. Based on a tight understanding of the…
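The setting in the abstract can be explored numerically. The following is a minimal sketch, not the authors' code: both players of Matching Pennies run gradient descent with the same fixed step size, and the row player's regret is tracked over time. The update form (project the starting strategy plus `eta` times the cumulative payoff vector onto the simplex), the function names `project` and `simulate`, the step size `0.1`, and the starting strategies are all assumptions made for illustration.

```python
import math

# Matching Pennies payoffs for the row player; the column player gets the negative.
A = ((1.0, -1.0), (-1.0, 1.0))


def project(z):
    """Euclidean projection of a 2-vector onto the probability simplex."""
    p = min(1.0, max(0.0, 0.5 * (1.0 + z[0] - z[1])))
    return (p, 1.0 - p)


def simulate(eta, horizon, x0=(0.9, 0.1), y0=(0.5, 0.5)):
    """Both players run gradient descent with the fixed step size eta.

    Assumed update (illustrative): strategy = project(start + eta * cumulative payoffs).
    Returns the row player's regret after every round and the time-average
    probability the row player places on the first strategy.
    """
    cum_x = [0.0, 0.0]  # cumulative payoff of each pure row strategy
    cum_y = [0.0, 0.0]  # cumulative payoff of each pure column strategy
    earned = 0.0        # payoff the row player actually collected
    mass_on_first = 0.0
    regrets = []
    for _ in range(horizon):
        x = project((x0[0] + eta * cum_x[0], x0[1] + eta * cum_x[1]))
        y = project((y0[0] + eta * cum_y[0], y0[1] + eta * cum_y[1]))
        pay_x = [A[i][0] * y[0] + A[i][1] * y[1] for i in range(2)]     # A y
        pay_y = [-(A[0][j] * x[0] + A[1][j] * x[1]) for j in range(2)]  # -A^T x
        earned += x[0] * pay_x[0] + x[1] * pay_x[1]
        mass_on_first += x[0]
        for i in range(2):
            cum_x[i] += pay_x[i]
            cum_y[i] += pay_y[i]
        # Regret: best fixed pure strategy in hindsight minus collected payoff.
        regrets.append(max(cum_x) - earned)
    return regrets, mass_on_first / horizon


if __name__ == "__main__":
    regrets, avg_p = simulate(eta=0.1, horizon=10_000)
    for t in (100, 1_000, 10_000):
        r = regrets[t - 1]
        print(f"T={t:>6}  regret/T={r / t:.4f}  regret/sqrt(T)={r / math.sqrt(t):.3f}")
    print("time-average P(first strategy):", round(avg_p, 4))
```

Running the script prints the regret normalized by $T$ and by $\sqrt{T}$ at a few horizons, so the claimed scaling can be checked against other values of `eta` and other starting strategies. Note that starting both players at the uniform strategy leaves the dynamics stationary, since that point is already the Nash equilibrium of Matching Pennies.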

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics