A Scalable Game Theoretic Approach for Coordination of Multiple Dynamic Systems
Mostafa M. Shibl, Vijay Gupta

TL;DR
This paper introduces a scalable, local-information-based natural policy gradient method for multi-agent systems modeled as Markov potential games, enabling convergence to near-optimal policies for large, dynamic systems.
Contribution
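It proposes a localized variant of natural policy gradient in which each agent uses information only from a local neighborhood of agents rather than the full global state and all other agents' actions. This reduces information requirements and lets learning scale to many agents, while the control policies still converge to a neighborhood of optimal policies in dynamic, coupled systems.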
Findings
Convergence to a neighborhood of optimal policies with local information.
Applicability to large-scale sensor coverage problems.
Potential for improved team coordination through decomposed cost functions.
Abstract
Learning in games provides a powerful framework to design control policies for self-interested agents that may be coupled through their dynamics, costs, or constraints. We consider the case where the dynamics of the coupled system can be modeled as a Markov potential game. In this case, distributed learning by the agents ensures that their control policies converge to a Nash equilibrium of this game. However, typical learning algorithms such as natural policy gradient require knowledge of the entire global state and actions of all the other agents, and may not be scalable as the number of agents grows. We show that by limiting the information flow to a local neighborhood of agents in the natural policy gradient algorithm, we can converge to a neighborhood of optimal policies. If the game can be designed through decomposing a global cost function of interest to a designer into local…
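To make the idea of a locally informed natural policy gradient step concrete, here is a minimal sketch, not the authors' algorithm. It assumes a tabular softmax policy, for which a natural policy gradient step is known to reduce to a multiplicative-weights update on the policy. The names `k_hop_neighbors` and `local_npg_step`, the interaction graph `adjacency`, the hop radius `k`, and the `local_advantage` input (cost advantages estimated from neighborhood information only) are all hypothetical and are not taken from the paper.

```python
import numpy as np


def k_hop_neighbors(adjacency, agent, k):
    """Agents within k hops of `agent` in an interaction graph (agent included).

    `adjacency` is an (n, n) 0/1 matrix. The graph and the radius k are
    illustrative assumptions about how a "local neighborhood" might be defined.
    """
    seen = {agent}
    frontier = {agent}
    for _ in range(k):
        reached = {int(j) for i in frontier for j in np.flatnonzero(adjacency[i])}
        frontier = reached - seen
        seen |= frontier
    return sorted(seen)


def local_npg_step(policy, local_advantage, step_size):
    """One softmax natural-policy-gradient step for a single agent.

    policy:          (n_local_states, n_actions), strictly positive, rows sum to 1
    local_advantage: same shape; cost advantages that the agent estimates from
                     its neighborhood only (hypothetical input)
    """
    # Costs are minimized, so actions with lower advantage gain probability.
    logits = np.log(policy) - step_size * local_advantage
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum(axis=1, keepdims=True)


# Usage: four agents on a line graph 0-1-2-3; agent 0 sees only its 1-hop neighborhood.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
neighborhood = k_hop_neighbors(adjacency, agent=0, k=1)

policy = np.full((2, 2), 0.5)                     # uniform over 2 actions in 2 local states
advantage = np.array([[1.0, -1.0], [0.0, 0.0]])   # made-up numbers for illustration
updated = local_npg_step(policy, advantage, step_size=0.5)
```

In this sketch the neighborhood only determines which information would feed the advantage estimate; how that estimate is computed, and how large the neighborhood must be for the resulting policies to stay near optimal ones, is the subject of the paper and is not reproduced here.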
Taxonomy
Topics: Multi-Agent Systems and Negotiation
