Alternating Mirror Descent for Constrained Min-Max Games

Andre Wibisono; Molei Tao; Georgios Piliouras

arXiv:2206.04160·cs.GT·June 10, 2022

TL;DR

This paper introduces an alternating mirror descent algorithm for two-player bilinear zero-sum games with constrained strategy spaces and proves an $O(K^{-2/3})$ bound on its average regret after $K$ iterations, improving on the $O(K^{-1/2})$ bound for the simultaneous version of mirror descent.

Contribution

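The paper proposes and analyzes alternating mirror descent, in which the two players take turns updating, for constrained bilinear zero-sum games. It establishes a tighter average regret bound than the one available for simultaneous mirror descent and connects to known results for alternating gradient descent in unconstrained zero-sum games.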

Findings

01

Establishes an $O(K^{-2/3})$ bound on the average regret of alternating mirror descent after $K$ iterations.

02

Contrasts this with simultaneous mirror descent, which is known to diverge and yields an $O(K^{-1/2})$ average regret bound.

03

Recovers known results for unconstrained zero-sum games with alternating gradient descent.

Abstract

In this paper we study two-player bilinear zero-sum games with constrained strategy spaces. Such constraints arise naturally when mixed strategies are used, which corresponds to a probability simplex constraint. We propose and analyze the alternating mirror descent algorithm, in which the players take turns updating their strategies according to the mirror descent algorithm for constrained optimization. We interpret alternating mirror descent as an alternating discretization of a skew-gradient flow in the dual space, and use tools from convex optimization and a modified energy function to establish an $O(K^{-2/3})$ bound on its average regret after $K$ iterations. This quantitatively verifies the algorithm's better behavior than the simultaneous version of mirror descent, which is known to diverge and yields an $O(K^{-1/2})$ average regret bound. In the special case…
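To make the turn-taking update concrete, here is a minimal sketch of alternating versus simultaneous mirror descent on the probability simplex. It is an illustration under stated assumptions, not the authors' implementation: the negative-entropy mirror map (which gives multiplicative-weights updates), the matching pennies payoff matrix, the step size, the starting point, and the helper names `entropic_step`, `run`, and `kl_from_uniform` are all choices made here. The sketch tracks the summed KL divergence from the uniform equilibrium as a rough proxy for how far the iterates drift; it does not compute the regret quantity analyzed in the paper.

```python
import numpy as np


def entropic_step(p, grad, eta):
    """One mirror descent step on the simplex with the negative-entropy
    mirror map (assumed here): a multiplicative-weights update."""
    w = p * np.exp(-eta * grad)
    return w / w.sum()


def run(A, x0, y0, eta, K, alternating=True):
    """Play K rounds of min_x max_y x^T A y over two simplices.

    alternating=True:  y responds to the freshly updated x.
    alternating=False: both players respond to the previous iterates.
    """
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    for _ in range(K):
        x_new = entropic_step(x, A @ y, eta)         # x descends on x^T A y
        x_seen = x_new if alternating else x         # what y reacts to
        y = entropic_step(y, -(A.T @ x_seen), eta)   # y ascends on x^T A y
        x = x_new
    return x, y


def kl_from_uniform(p):
    """KL(uniform || p), used as a distance-to-equilibrium proxy."""
    u = np.full_like(p, 1.0 / len(p))
    return float(np.sum(u * np.log(u / p)))


# Matching pennies (assumed example): the unique equilibrium is uniform play.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x0, y0 = [0.7, 0.3], [0.5, 0.5]

if __name__ == "__main__":
    for name, alt in [("alternating", True), ("simultaneous", False)]:
        x, y = run(A, x0, y0, eta=0.1, K=1000, alternating=alt)
        print(name, kl_from_uniform(x) + kl_from_uniform(y))
```

In this setup the only difference between the two variants is whether the second player sees the first player's new strategy or the old one. The simultaneous variant should drift toward the boundary of the simplex, while the alternating variant should stay near its initial distance from equilibrium, which is consistent with the qualitative contrast the abstract draws.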

Videos

Alternating Mirror Descent for Constrained Min-Max Games · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Markov Chains and Monte Carlo Methods