Almost Sure Convergence of Networked Policy Gradient over Time-Varying Networks in Markov Potential Games

Sarper Aydin; Ceyhun Eksin

arXiv:2410.20075·eess.SY·October 2, 2025

TL;DR

This paper introduces a networked policy gradient method for Markov potential games with time-varying communication networks, proving almost sure convergence to stationary points without bounded gradient assumptions.

Contribution

The paper gives a convergence proof for networked policy gradient in Markov potential games that accommodates time-varying communication networks and does not rely on the bounded-gradient assumption required by earlier analyses.

Findings

01. Convergence to stationary points is proven with rate O(1/ε²).

02. Numerical experiments show convergence of local beliefs and gradients.

03. Networked policy gradient achieves higher rewards than independent updates.

Abstract

We propose networked policy gradient play for solving Markov potential games with continuous and/or discrete state-action pairs. During the game, agents use parametrized and differentiable policies that depend on the current state and the policy parameters of other agents. During training, agents update their policy parameters following stochastic gradients. The gradient estimation involves two consecutive episodes, generating unbiased estimators of reward and policy score functions. In addition, it involves keeping estimates of others' parameters using consensus steps given local estimates received through a time-varying communication network. In Markov potential games, there exists a potential value function among agents with gradients corresponding to the gradients of local value functions. Using this structure, we prove almost sure convergence to a stationary point of the potential…
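
The consensus step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the scalar parameters, the Metropolis weighting rule, the alternating edge sets, and the helper names (`metropolis_weights`, `consensus_step`, `simulate`) are all assumptions made for illustration. The true parameters are also held fixed here, whereas in the paper they change with each stochastic gradient update. The sketch only shows that local estimates of other agents' parameters can reach the true values when the network at each step is disconnected, provided the union of the networks over time is connected.

```python
def metropolis_weights(n, edges):
    """Symmetric, doubly stochastic mixing matrix for an undirected graph.

    The Metropolis rule is an assumed choice; the source does not specify weights.
    """
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        w = 1.0 / (1 + max(deg[i], deg[j]))
        W[i][j] = w
        W[j][i] = w
    for i in range(n):
        # Diagonal is still zero here, so this sums the off-diagonal weights.
        W[i][i] = 1.0 - sum(W[i])
    return W


def consensus_step(est, W, theta):
    """One consensus update; est[i][k] is agent i's estimate of agent k's parameter."""
    n = len(theta)
    new = [
        [sum(W[i][j] * est[j][k] for j in range(n)) for k in range(n)]
        for i in range(n)
    ]
    for i in range(n):
        new[i][i] = theta[i]  # each agent always knows its own parameter exactly
    return new


def simulate(theta, graphs, steps):
    """Run consensus over a periodically switching sequence of edge sets."""
    n = len(theta)
    # Agents start knowing only their own parameter; other entries start at 0.
    est = [[theta[i] if i == k else 0.0 for k in range(n)] for i in range(n)]
    for t in range(steps):
        W = metropolis_weights(n, graphs[t % len(graphs)])
        est = consensus_step(est, W, theta)
    return est


# Hypothetical setup: 4 agents with fixed scalar parameters and two alternating
# edge sets. Each edge set is disconnected; their union is a connected ring.
theta = [0.5, -1.0, 2.0, 0.0]
graphs = [[(0, 1), (2, 3)], [(1, 2), (3, 0)]]
est = simulate(theta, graphs, steps=200)
max_err = max(abs(est[i][k] - theta[k]) for i in range(4) for k in range(4))
print(f"max estimation error after 200 steps: {max_err:.2e}")
```

In a full training loop, a gradient step on each agent's own parameter would sit between consensus steps, so the estimates would track moving targets instead of fixed ones.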

Taxonomy

Topics: Age of Information Optimization · Advanced Wireless Network Optimization · Distributed Sensor Networks and Detection Algorithms