Convergence of Time-Averaged Mean Field Gradient Descent Dynamics for Continuous Multi-Player Zero-Sum Games
Yulong Lu, Pierre Monmarché

TL;DR
This paper introduces a mean-field gradient descent method with momentum for multi-player zero-sum games, proving exponential convergence to mixed Nash equilibria and enabling annealing to unregularized equilibria.
Contribution
It develops a unified mean-field gradient descent approach with exponential convergence guarantees for multi-player zero-sum games, improving upon previous polynomial rates.
Findings
Proves exponential convergence to the mixed Nash equilibrium (MNE) in total variation under a fixed entropic regularization.
Introduces a single time-scale approach for all player types.
Demonstrates convergence to unregularized MNE via simulated annealing.
Abstract
The approximation of mixed Nash equilibria (MNE) for zero-sum games with mean-field interacting players has recently raised much interest in machine learning. In this paper we propose a mean-field gradient descent dynamics for finding the MNE of zero-sum games involving multiple players. The evolution of the players' strategy distributions follows coupled mean-field gradient descent flows with momentum, incorporating an exponentially discounted time-averaging of gradients. First, in the case of a fixed entropic regularization, we prove an exponential convergence rate for the mean-field dynamics to the mixed Nash equilibrium with respect to the total variation metric. This improves a previous polynomial convergence rate for a similar time-averaged dynamics with different averaging factors. Moreover, unlike previous two-scale approaches for finding the MNE, our approach treats…
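To make "exponentially discounted time-averaging of gradients" concrete, here is a minimal particle sketch in Python. It is not the paper's scheme: the toy payoff f(x, y) = x*y + x**2/2 - y**2/2, the function names, and every parameter value are assumptions chosen for illustration. Each particle keeps a running average m of its mean-field gradient g, updated by an Euler step of dm/dt = alpha * (g - m), and moves along that average plus Gaussian noise standing in for the entropic regularization. The sketch only illustrates the averaging mechanism and makes no claim to reproduce the regularized equilibrium or the rates proved in the paper.

```python
import math
import random


def discounted_average_step(avg, grad, alpha, dt):
    """One Euler step of d(avg)/dt = alpha * (grad - avg).

    The continuous-time solution is the exponentially discounted average
    avg(t) = exp(-alpha*t) * avg(0) + alpha * int_0^t exp(-alpha*(t-s)) * grad(s) ds.
    """
    return avg + alpha * dt * (grad - avg)


def simulate(n_particles=200, steps=2000, dt=0.01, alpha=4.0, temp=0.1, seed=0):
    """Toy two-player zero-sum game (hypothetical payoff, not from the paper):
    f(x, y) = x*y + x**2/2 - y**2/2, player 1 minimizes in x, player 2 maximizes in y.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(1.0, 1.0) for _ in range(n_particles)]
    ys = [rng.gauss(-1.0, 1.0) for _ in range(n_particles)]
    gx = [0.0] * n_particles  # discounted gradient averages, player 1
    gy = [0.0] * n_particles  # discounted gradient averages, player 2
    noise = math.sqrt(2.0 * temp * dt)  # noise scale of the Langevin-type step
    for _ in range(steps):
        mean_x = sum(xs) / n_particles
        mean_y = sum(ys) / n_particles
        for i in range(n_particles):
            # Mean-field gradients of the toy payoff:
            # d/dx E_y[f] = mean_y + x,   d/dy E_x[f] = mean_x - y
            gx[i] = discounted_average_step(gx[i], mean_y + xs[i], alpha, dt)
            gy[i] = discounted_average_step(gy[i], mean_x - ys[i], alpha, dt)
            xs[i] += -gx[i] * dt + noise * rng.gauss(0.0, 1.0)  # descent for player 1
            ys[i] += gy[i] * dt + noise * rng.gauss(0.0, 1.0)   # ascent for player 2
    return xs, ys


if __name__ == "__main__":
    xs, ys = simulate()
    print(sum(xs) / len(xs), sum(ys) / len(ys))
```

In this toy setup the particle means start near +1 and -1 and drift toward the saddle point at 0. The averaging rate alpha matters: for the mean dynamics of this particular payoff, alpha = 1 is only marginally stable, which is why the sketch uses a larger value.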
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Game Theory and Applications · Advanced Bandit Algorithms Research
