Optimism Without Regularization: Constant Regret in Zero-Sum Games
John Lazarsfeld, Georgios Piliouras, Ryann Sim, Stratis Skoulakis

TL;DR
This paper shows that Optimistic Fictitious Play achieves constant regret in two-strategy, two-player zero-sum games without regularization, challenging the assumption that regularization is necessary for fast learning.
Contribution
It proves for the first time that unregularized optimistic fictitious play attains constant regret in two-strategy zero-sum games, using a novel geometric proof technique.
Findings
Optimistic Fictitious Play achieves constant regret without regularization.
Regularized algorithms like Optimistic FTRL are not the only means for fast learning.
Alternating Fictitious Play has a regret lower bound of Ω(√T).
Abstract
This paper studies the optimistic variant of Fictitious Play for learning in two-player zero-sum games. While it is known that Optimistic FTRL -- a regularized algorithm with a bounded stepsize parameter -- obtains constant regret in this setting, we show for the first time that similar, optimal rates are also achievable without regularization: we prove for two-strategy games that Optimistic Fictitious Play (using any tiebreaking rule) obtains only constant regret, providing surprising new evidence on the ability of non-no-regret algorithms for fast learning in games. Our proof technique leverages a geometric view of Optimistic Fictitious Play in the dual space of payoff vectors, where we show a certain energy function of the iterates remains bounded over time. Additionally, we also prove a regret lower bound of Ω(√T) for Alternating Fictitious Play. In the unregularized…
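To make the abstract's setting concrete, here is a minimal Python sketch that compares standard and optimistic fictitious play on one two-strategy game. It rests on assumptions not stated on this page: the optimistic update is taken to be a best response to the cumulative payoff vector with the most recent payoff vector counted twice, ties are broken toward the lowest index (NumPy's `argmax` behavior), and the game is matching pennies. The names `total_regret` and `MATCHING_PENNIES` are hypothetical and chosen for illustration only.

```python
import numpy as np

# Illustrative game (an assumption, not taken from the paper): matching pennies.
# The row player maximizes x^T A y; the column player minimizes it.
MATCHING_PENNIES = np.array([[1.0, -1.0], [-1.0, 1.0]])


def total_regret(A, T, optimistic):
    """Run T rounds of (optimistic) fictitious play; return the summed regret.

    Assumed update: each player best-responds to its cumulative payoff
    vector, and in the optimistic variant the latest payoff vector is
    added once more as a prediction of the next round.
    """
    n, m = A.shape
    cum_row = np.zeros(n)   # sum over rounds of A y^t
    cum_col = np.zeros(m)   # sum over rounds of -A^T x^t
    last_row = np.zeros(n)  # most recent payoff vector seen by the row player
    last_col = np.zeros(m)  # most recent payoff vector seen by the column player
    for _ in range(T):
        score_row = cum_row + last_row if optimistic else cum_row
        score_col = cum_col + last_col if optimistic else cum_col
        # np.argmax returns the first maximizer, i.e. lowest-index tiebreaking.
        i = int(np.argmax(score_row))
        j = int(np.argmax(score_col))
        last_row = A[:, j].copy()
        last_col = -A[i, :]
        cum_row = cum_row + last_row
        cum_col = cum_col + last_col
    # The realized payoffs of the two players cancel in a zero-sum game, so
    # the sum of both regrets is just the two best-in-hindsight totals.
    return float(cum_row.max() + cum_col.max())


if __name__ == "__main__":
    for T in (100, 1000, 10000):
        print(T,
              total_regret(MATCHING_PENNIES, T, optimistic=False),
              total_regret(MATCHING_PENNIES, T, optimistic=True))
```

On this single instance the optimistic variant's summed regret should stay small as T grows, while the standard variant's keeps increasing. This is only a sanity check consistent with the claim above, not a substitute for the paper's proof.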
Taxonomy
Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Stochastic Gradient Optimization Techniques
