Convex Markov Games and Beyond: New Proof of Existence, Characterization and Learning Algorithms for Nash Equilibria
Anas Barakat, Ioannis Panageas, Antonios Varvitsiotis

TL;DR
This paper extends the theory of convex Markov games (cMGs) to a broader class, General Utility Markov Games (GUMGs). It provides a new proof of Nash equilibrium existence, a characterization of equilibria, and learning algorithms, including policy gradient methods with complexity guarantees.
Contribution
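It introduces GUMGs, proves Nash equilibrium (NE) existence via Brouwer's fixed-point theorem, characterizes NE as first-order stationary points (fixed points of projected pseudo-gradient dynamics), and develops policy gradient algorithms with complexity bounds.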
Findings
Nash equilibria coincide with the fixed points of projected pseudo-gradient dynamics in GUMGs
Existence of Markov perfect equilibria in GUMGs
Sample complexity bounds for policy gradient algorithms
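The fixed-point characterization in the first finding can be illustrated on a much simpler object than a GUMG. The sketch below is not the paper's algorithm: it assumes a two-player normal-form game (Matching Pennies), where each player's utility is linear in their own mixed strategy, and checks whether a strategy pair is left unchanged by one projected pseudo-gradient step. The names `project_simplex`, `pseudo_gradient_step`, and `is_fixed_point`, and the step size `eta`, are hypothetical choices made for this illustration.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    n = v.size
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0                  # shifted cumulative sums
    idx = np.arange(1, n + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)              # threshold to subtract
    return np.maximum(v - theta, 0.0)

def pseudo_gradient_step(x, y, A, B, eta=0.1):
    """One simultaneous projected gradient-ascent step for both players.

    Player 1 maximizes x^T A y over x; player 2 maximizes x^T B y over y.
    Each player moves along the gradient of their own utility only.
    """
    x_new = project_simplex(x + eta * (A @ y))
    y_new = project_simplex(y + eta * (B.T @ x))
    return x_new, y_new

def is_fixed_point(x, y, A, B, eta=0.1, tol=1e-9):
    """True if (x, y) is unchanged by one projected pseudo-gradient step."""
    x_new, y_new = pseudo_gradient_step(x, y, A, B, eta)
    return bool(np.allclose(x_new, x, atol=tol) and np.allclose(y_new, y, atol=tol))

# Matching Pennies (assumed example game): zero-sum, so B = -A.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A

uniform = np.array([0.5, 0.5])   # the unique Nash equilibrium strategy
pure = np.array([1.0, 0.0])      # a pure strategy, not an equilibrium

print(is_fixed_point(uniform, uniform, A, B))
print(is_fixed_point(pure, pure, A, B))
```

In this toy game the uniform profile is a fixed point because both players' gradients vanish there, while the pure profile is not, since player 2 can improve by shifting weight to the other action. The paper's result concerns the far more general setting of occupancy-measure utilities, where this equivalence relies on the agent-wise gradient domination property rather than on linearity.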
Abstract
Convex Markov Games (cMGs) were recently introduced as a broad class of multi-agent learning problems that generalize Markov games to settings where strategic agents optimize general utilities beyond additive rewards. While cMGs expand the modeling frontier, their theoretical foundations, particularly the structure of Nash equilibria (NE) and guarantees for learning algorithms, are not yet well understood. In this work, we address these gaps for an extension of cMGs, which we term General Utility Markov Games (GUMGs), capturing new applications requiring coupling between agents' occupancy measures. We prove that in GUMGs, Nash equilibria coincide with the fixed points of projected pseudo-gradient dynamics (i.e., first-order stationary points), enabled by a novel agent-wise gradient domination property. This insight also yields a simple proof of NE existence using Brouwer's fixed-point…
Taxonomy
Topics: Game Theory and Applications · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research
