A Unified Framework for Locality in Scalable MARL
Sourav Chakraborty, Amit Kiran Rege, Claire Monteleoni, Lijun Chen

TL;DR
This paper introduces a unified, policy-dependent framework for understanding locality in scalable multi-agent reinforcement learning, improving upon prior conservative bounds by revealing a fundamental tradeoff and providing spectral conditions for exponential decay.
Contribution
It presents a novel decomposition of the policy-induced interdependence matrix, establishing a tighter spectral condition for locality that accounts for policy smoothness and environment coupling.
Findings
The spectral condition $\rho(E^{\text{s}} + E^{\text{a}}\Pi(\pi)) < 1$ guarantees exponential decay.
Policy smoothness can induce locality even in strongly coupled environments.
Framework enables provably-sound localized policy improvement with spectral guarantees.
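The spectral condition in the findings above can be checked numerically once the three matrices are available. The sketch below is illustrative only: the 3-agent chain, the entries of `E_s`, `E_a`, and `Pi`, and the helper names are all hypothetical and are not taken from the paper. It assumes nonnegative sensitivity matrices and simply evaluates the spectral radius of `E_s + E_a @ Pi` for a smooth policy and for a sharp one.

```python
import numpy as np


def spectral_radius(M: np.ndarray) -> float:
    """Largest absolute eigenvalue of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))


def satisfies_spectral_condition(E_s: np.ndarray, E_a: np.ndarray, Pi: np.ndarray) -> bool:
    """Check rho(E_s + E_a @ Pi) < 1 for the given (assumed) matrices."""
    return spectral_radius(E_s + E_a @ Pi) < 1.0


# Hypothetical 3-agent chain: each agent interacts with itself and its neighbors.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])

E_s = 0.2 * A   # assumed weak state coupling in the environment
E_a = 1.5 * A   # assumed strong action coupling in the environment

Pi_smooth = 0.05 * np.eye(3)  # assumed smooth policy: small sensitivity to state
Pi_sharp = 1.0 * np.eye(3)    # assumed sharp policy: large sensitivity to state

for name, Pi in [("smooth", Pi_smooth), ("sharp", Pi_sharp)]:
    rho = spectral_radius(E_s + E_a @ Pi)
    print(f"{name} policy: rho = {rho:.3f}, condition holds: {rho < 1.0}")
```

In this toy setup the action coupling alone is large, yet the smooth policy keeps the spectral radius below 1 while the sharp policy does not, which mirrors the claim that policy smoothness can induce locality in a strongly coupled environment.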
Abstract
Scalable Multi-Agent Reinforcement Learning (MARL) is fundamentally challenged by the curse of dimensionality. A common solution is to exploit locality, which hinges on an Exponential Decay Property (EDP) of the value function. However, existing conditions that guarantee the EDP are often conservative, as they are based on worst-case, environment-only bounds (e.g., supremums over actions) and fail to capture the regularizing effect of the policy itself. In this work, we establish that locality can also be a policy-dependent phenomenon. Our central contribution is a novel decomposition of the policy-induced interdependence matrix, which decouples the environment's sensitivity to state ($E^{\text{s}}$) and action ($E^{\text{a}}$) from the policy's sensitivity to state ($\Pi(\pi)$). This decomposition reveals that locality can be induced by a smooth policy (small…
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Adaptive Dynamic Programming Control
