TL;DR
MADDPG-K is a scalable multi-agent reinforcement learning method that limits critic inputs to the k nearest agents, reducing computational complexity while maintaining or improving performance.
Contribution
Introduces MADDPG-K, a scalable extension of MADDPG that restricts each agent's critic inputs to its k nearest agents, enabling efficient training in large multi-agent systems.
Findings
MADDPG-K achieves performance competitive with or better than MADDPG.
It converges faster in cooperative environments.
It scales better as the number of agents increases.
Abstract
We propose MADDPG-K, a scalable extension to Multi-Agent Deep Deterministic Policy Gradient (MADDPG) that addresses the computational limitations of centralized critic approaches. Centralized critics, which condition on the observations and actions of all agents, have demonstrated significant performance gains in cooperative and competitive multi-agent settings. However, the input size of their critic networks grows linearly with the number of agents, making them increasingly expensive to train at scale. MADDPG-K mitigates this by restricting each agent's critic to the k closest agents under a chosen metric, which in our case is Euclidean distance. This ensures a constant-size critic input regardless of the total agent count. We analyze the complexity of this approach, showing that the quadratic cost it retains arises from cheap scalar distance computations rather than the expensive neural…
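The neighbor restriction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `k_nearest_indices` and `build_critic_inputs`, the use of agent positions as the distance input, and the choice to place each agent's own observation and action ahead of its neighbors' are all assumptions made for illustration. The sketch shows the two points the abstract makes: the pairwise distance step is quadratic in the number of agents but involves only scalar arithmetic, and the resulting critic input width depends on k, not on the total agent count.

```python
import numpy as np


def k_nearest_indices(positions: np.ndarray, k: int) -> np.ndarray:
    """Return, for each agent, the indices of its k nearest other agents.

    positions: float array of shape (n_agents, dim)
    returns:   int array of shape (n_agents, k), nearest first
    """
    n = positions.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n_agents")
    # Pairwise squared Euclidean distances: O(n^2) cheap scalar work.
    diff = positions[:, None, :] - positions[None, :, :]
    dist2 = np.einsum("ijd,ijd->ij", diff, diff)
    # An agent is never its own neighbor.
    np.fill_diagonal(dist2, np.inf)
    # Stable sort so that ties resolve by agent index.
    order = np.argsort(dist2, axis=1, kind="stable")
    return order[:, :k]


def build_critic_inputs(
    obs: np.ndarray, actions: np.ndarray, positions: np.ndarray, k: int
) -> np.ndarray:
    """Build one fixed-width critic input row per agent.

    Assumed layout (hypothetical): [own obs, own action,
    neighbor 1 obs, neighbor 1 action, ..., neighbor k obs, neighbor k action].
    """
    neighbors = k_nearest_indices(positions, k)       # (n, k)
    own = np.concatenate([obs, actions], axis=1)      # (n, obs_dim + act_dim)
    neigh = own[neighbors]                            # (n, k, obs_dim + act_dim)
    return np.concatenate([own, neigh.reshape(len(own), -1)], axis=1)


def demo_input_width(n_agents: int, k: int = 3) -> int:
    """Return the critic input width for a random scene with n_agents agents."""
    rng = np.random.default_rng(0)
    obs = rng.normal(size=(n_agents, 4))        # obs_dim = 4 (arbitrary)
    actions = rng.normal(size=(n_agents, 2))    # act_dim = 2 (arbitrary)
    positions = rng.normal(size=(n_agents, 2))
    return build_critic_inputs(obs, actions, positions, k).shape[1]
```

Under these assumptions the input width is (k + 1) × (obs_dim + act_dim), so `demo_input_width(5)` and `demo_input_width(50)` return the same value, whereas a fully centralized critic input would grow tenfold between the two cases.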
