Scalable Planning in Multi-Agent MDPs
Dinuka Sahabandu, Luyao Niu, Andrew Clark, Radha Poovendran

TL;DR
This paper introduces a new approximate transition dependence measure for multi-agent MDPs, enabling scalable planning with provable bounds on optimality in large multi-agent systems.
Contribution
It proposes $\delta$-transition dependence, a metric to quantify deviation from transition independence, and develops a polynomial-time algorithm with theoretical guarantees.
Findings
Algorithm achieves near-optimal solutions for large multi-agent MDPs.
Effective in multi-robot control and patrolling scenarios.
Provides a scalable approach with provable bounds.
Abstract
Multi-agent Markov Decision Processes (MMDPs) arise in a variety of applications including target tracking, control of multi-robot swarms, and multiplayer games. A key challenge in MMDPs occurs when the state and action spaces grow exponentially in the number of agents, making computation of an optimal policy computationally intractable for medium- to large-scale problems. One property that has been exploited to mitigate this complexity is transition independence, in which each agent's transition probabilities are independent of the states and actions of other agents. Transition independence enables factorization of the MMDP and computation of local agent policies but does not hold for arbitrary MMDPs. In this paper, we propose an approximate transition dependence property, called $\delta$-transition dependence, and develop a metric for quantifying how far an MMDP deviates from…
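To make the transition independence property from the abstract concrete, here is a minimal Python sketch. It is not taken from the paper: the function name `joint_transition`, the two example matrices, and the simplification of fixing one action per agent are all assumptions made for illustration. Under transition independence, the joint transition matrix is the Kronecker product of the per-agent matrices, so each agent can be stored and reasoned about separately even though the joint state space grows exponentially with the number of agents.

```python
import numpy as np

def joint_transition(local_kernels):
    """Build a joint transition matrix from per-agent matrices.

    Assumes transition independence: each agent's next state depends only
    on its own current state (and its own action, fixed here for brevity).
    Hypothetical helper, not the paper's algorithm.
    """
    joint = np.ones((1, 1))
    for P in local_kernels:
        # Kronecker product multiplies the independent local probabilities.
        joint = np.kron(joint, P)
    return joint

# Two hypothetical agents, each with 2 local states and one fixed action.
P1 = np.array([[0.9, 0.1],
               [0.2, 0.8]])
P2 = np.array([[0.5, 0.5],
               [0.3, 0.7]])

P_joint = joint_transition([P1, P2])

# Joint state index is s1 * 2 + s2, so the joint matrix is 4 x 4.
# With n such agents the joint matrix would have 2**n rows, while the
# factored representation only stores n matrices of size 2 x 2.
print(P_joint.shape)
print(P_joint.sum(axis=1))
```

Each row of `P_joint` sums to 1, and the entry for moving from joint state (0, 1) to (1, 0) equals `P1[0, 1] * P2[1, 0]`. An MMDP that is not transition independent cannot be written as such a product exactly, which is the gap the paper's $\delta$-transition dependence measure is meant to quantify.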
