Solving Zero-Sum Convex Markov Games

Fivos Kalogiannis; Emmanouil-Vasileios Vlatakis-Gkaragkounis; Ian Gemp; Georgios Piliouras

arXiv:2506.16120 · cs.GT · June 23, 2025

TL;DR

This paper proves the first global convergence guarantees for independent policy gradient methods in two-player zero-sum convex Markov games, a broad class of multi-agent strategic models, using novel regularization techniques.

Contribution

The paper introduces a regularization approach that transforms the nonconvex min-max problem into a nonconvex-proximal Polyak-Łojasiewicz (NC-pPL) objective, a form for which policy gradient algorithms admit provable convergence guarantees.

Findings

01. Proves global convergence of independent policy gradient methods to Nash equilibria in two-player zero-sum convex Markov games.

02. Develops a simple nonconvex regularization that turns the hidden-convex–hidden-concave min-max problem into an NC-pPL objective.

03. Provides the first global convergence guarantees for stochastic nested and alternating gradient methods.

Abstract

We contribute the first provable guarantees of global convergence to Nash equilibria (NE) in two-player zero-sum convex Markov games (cMGs) by using independent policy gradient methods. Convex Markov games, recently defined by Gemp et al. (2024), extend Markov decision processes to multi-agent settings with preferences that are convex over occupancy measures, offering a broad framework for modeling generic strategic interactions. However, even the fundamental min-max case of cMGs presents significant challenges, including inherent nonconvexity, the absence of Bellman consistency, and the complexity of the infinite horizon. We follow a two-step approach. First, leveraging properties of hidden-convex–hidden-concave functions, we show that a simple nonconvex regularization transforms the min-max optimization problem into a nonconvex-proximal Polyak-Łojasiewicz (NC-pPL) objective.…
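
To make the role of regularization in alternating gradient methods concrete, here is a minimal toy sketch. It is not the paper's algorithm or setting: it uses a single-state matching-pennies game (bilinear, so convex-concave rather than nonconvex), direct parameterization by one probability per player, and a quadratic regularizer chosen purely for illustration. The names `alternating_gda`, `duality_gap`, `mu`, and `eta` are hypothetical. The sketch only illustrates the general idea that alternating projected gradient descent-ascent can cycle on the unregularized objective, while a small regularizer makes the iterates contract toward the equilibrium.

```python
def clip01(p):
    """Project a scalar onto [0, 1] (the 2-action simplex in one coordinate)."""
    return min(1.0, max(0.0, p))


def duality_gap(p, q):
    """Exploitability in matching pennies with payoff f(p, q) = (2p - 1)(2q - 1).

    The min player controls p, the max player controls q.
    max_q' f(p, q') = |2p - 1| and min_p' f(p', q) = -|2q - 1|.
    """
    return abs(2.0 * p - 1.0) + abs(2.0 * q - 1.0)


def alternating_gda(mu, eta=0.02, steps=5000, p=0.8, q=0.3):
    """Alternating projected gradient descent-ascent on the toy objective

        f_mu(p, q) = (2p - 1)(2q - 1) + (mu/4)(2p - 1)^2 - (mu/4)(2q - 1)^2.

    ASSUMPTION: this quadratic regularizer is illustrative only; it is not
    claimed to be the regularizer used in the paper.
    """
    for _ in range(steps):
        # Min player descends first, using the current q.
        grad_p = 2.0 * (2.0 * q - 1.0) + mu * (2.0 * p - 1.0)
        p = clip01(p - eta * grad_p)
        # Max player then ascends, using the freshly updated p.
        grad_q = 2.0 * (2.0 * p - 1.0) - mu * (2.0 * q - 1.0)
        q = clip01(q + eta * grad_q)
    return p, q


if __name__ == "__main__":
    for mu in (0.0, 0.1):
        p, q = alternating_gda(mu)
        print(f"mu={mu}: p={p:.4f}, q={q:.4f}, gap={duality_gap(p, q):.2e}")
```

In this toy game the regularized saddle point coincides with the Nash equilibrium (p, q) = (1/2, 1/2), which is a convenience of the example; in general a regularizer perturbs the solution, and the regularization strength must be traded off against that bias.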

Videos

Solving Zero-Sum Convex Markov Games · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Stochastic Gradient Optimization Techniques · Game Theory and Applications