Decentralized Deterministic Multi-Agent Reinforcement Learning
Antoine Grosnit, Desmond Cai, Laura Wynter

TL;DR
This paper extends decentralized multi-agent reinforcement learning algorithms to deterministic policies in continuous action spaces, providing convergence guarantees and addressing exploration challenges.
Contribution
It introduces a provably-convergent decentralized actor-critic algorithm for deterministic policies in continuous spaces, expanding MARL applicability.
Findings
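Convergence guarantees for the new algorithm when value functions are linearly approximated.
Exploration with deterministic policies is handled by considering both off-policy and on-policy settings.
The approach is intended to help enable decentralized MARL in high-dimensional action spaces.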
Abstract
[Zhang, ICML 2018] provided the first decentralized actor-critic algorithm for multi-agent reinforcement learning (MARL) that offers convergence guarantees. In that work, policies are stochastic and are defined on finite action spaces. We extend those results to offer a provably-convergent decentralized actor-critic algorithm for learning deterministic policies on continuous action spaces. Deterministic policies are important in real-world settings. To handle the lack of exploration inherent in deterministic policies, we consider both off-policy and on-policy settings. We provide the expression of a local deterministic policy gradient, decentralized deterministic actor-critic algorithms and convergence guarantees for linearly-approximated value functions. This work will help enable decentralized MARL in high-dimensional action spaces and pave the way for more widespread use of MARL.
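To make the ingredients named in the abstract more concrete, here is a minimal NumPy sketch of two of them: a local deterministic policy gradient for one agent under a linearly-parameterized critic, and a consensus (averaging) step over agents' critic parameters. It is an illustration under stated assumptions, not the authors' algorithm. The function names, the linear policy `a_i = theta[i] @ phi_s`, the critic features `[phi(s), a, a**2]`, and the mixing matrix are all hypothetical choices made for this example. The gradient follows the standard deterministic policy gradient form, in which the gradient of the policy with respect to its parameters is multiplied by the gradient of the critic with respect to the action.

```python
import numpy as np

def critic_features(phi_s, a):
    # Assumed feature map: state features, joint action, squared joint action.
    return np.concatenate([phi_s, a, a ** 2])

def q_value(w, phi_s, a):
    # Critic that is linear in its parameters w.
    return float(w @ critic_features(phi_s, a))

def grad_q_wrt_action(w, phi_s, a, i):
    # dQ/da_i for the assumed features: w_lin[i] + 2 * w_quad[i] * a[i].
    d, n = phi_s.shape[0], a.shape[0]
    w_lin = w[d:d + n]
    w_quad = w[d + n:d + 2 * n]
    return w_lin[i] + 2.0 * w_quad[i] * a[i]

def local_policy_gradient(theta, w, phi_s, i):
    # Assumed deterministic policy: agent i plays the scalar a_i = theta[i] @ phi_s,
    # so the gradient of a_i with respect to theta[i] is phi_s.
    a = theta @ phi_s
    return phi_s * grad_q_wrt_action(w, phi_s, a, i)

def consensus_step(local_w, mixing):
    # Each agent replaces its critic parameters with a weighted average of the
    # agents' parameters; row k of `mixing` holds agent k's weights.
    return mixing @ local_w

# Tiny example with 2 agents and 2 state features (all values are made up).
phi_s = np.array([1.0, 2.0])
theta = np.array([[0.5, 0.0], [0.0, 0.25]])
w = np.array([0.0, 0.0, 1.0, -1.0, -1.0, 0.5])
grad_agent_1 = local_policy_gradient(theta, w, phi_s, 1)

mixing = np.array([[0.5, 0.5], [0.5, 0.5]])
mixed_w = consensus_step(np.array([[1.0, 2.0], [3.0, 4.0]]), mixing)
```

In this sketch each agent needs only its own policy parameters and a critic estimate to compute its gradient, while the consensus step is the only point at which agents exchange information. The quadratic action features are used only so that the action gradient depends on the action taken.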
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control
