DM$^2$: Decentralized Multi-Agent Reinforcement Learning for Distribution Matching
Caroline Wang, Ishan Durugkar, Elad Liebman, Peter Stone

TL;DR
This paper introduces DM$^2$, a decentralized multi-agent reinforcement learning approach that enables agents to coordinate by independently minimizing distribution mismatch, eliminating the need for centralized control or explicit communication.
Contribution
The paper proposes a novel distribution matching framework for decentralized multi-agent learning, with theoretical guarantees and a practical algorithm that leverages expert demonstrations.
Findings
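Under certain conditions, agents that each minimize their own distribution mismatch converge to the joint policy that generated the target distribution, without centralized control.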
Combining task and distribution matching rewards improves performance.
In experiments, agents trained this way outperform naive baselines.
Abstract
Current approaches to multi-agent cooperation rely heavily on centralized mechanisms or explicit communication protocols to ensure convergence. This paper studies the problem of distributed multi-agent learning without resorting to centralized components or explicit communication. It examines the use of distribution matching to facilitate the coordination of independent agents. In the proposed scheme, each agent independently minimizes the distribution mismatch to the corresponding component of a target visitation distribution. The theoretical analysis shows that under certain conditions, each agent minimizing its individual distribution mismatch allows the convergence to the joint policy that generated the target distribution. Further, if the target distribution is from a joint policy that optimizes a cooperative task, the optimal policy for a combination of this task reward and the…
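The per-agent objective described in the abstract can be illustrated with a small tabular sketch. This is not the paper's algorithm. It assumes discrete observations and actions, estimates each visitation distribution from raw counts, and uses Jensen-Shannon divergence as a stand-in mismatch measure. The helper names (`visitation`, `js_divergence`, `per_agent_mismatch`, `mixed_reward`), the `weight` parameter, and the toy data are hypothetical, and applying the mismatch as a single penalty on the task reward is a simplifying assumption.

```python
from collections import Counter
import math


def visitation(pairs):
    """Empirical distribution over (observation, action) pairs from raw counts."""
    counts = Counter(pairs)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two dict distributions.

    Assumption: JS divergence is only a stand-in for the mismatch measure.
    """
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a, b):
        # Every key in `a` has positive mass, and b[k] >= 0.5 * a[k] > 0.
        return sum(a[k] * math.log(a[k] / b[k]) for k in a)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


def per_agent_mismatch(agent_pairs, target_pairs):
    """Each agent compares only its own visitation to its own target component."""
    return [
        js_divergence(visitation(own), visitation(target))
        for own, target in zip(agent_pairs, target_pairs)
    ]


def mixed_reward(task_reward, mismatch, weight=1.0):
    """Hypothetical combination: task reward minus a weighted mismatch penalty."""
    return task_reward - weight * mismatch


# Toy data: per-agent components of a target visitation distribution.
target_pairs = [
    [("s0", "left"), ("s1", "left")],    # target component for agent 0
    [("s0", "right"), ("s1", "right")],  # target component for agent 1
]
# Agent 0 matches its component exactly; agent 1 does not overlap with its own.
agent_pairs = [
    [("s0", "left"), ("s1", "left")],
    [("s0", "left"), ("s1", "left")],
]

mismatches = per_agent_mismatch(agent_pairs, target_pairs)
rewards = [mixed_reward(1.0, m, weight=0.5) for m in mismatches]
```

In this toy setup, agent 0 has zero mismatch and keeps the full task reward, while agent 1's visitation is disjoint from its target component, so it incurs the maximum Jensen-Shannon penalty of ln 2. No agent needs another agent's data to compute its own term, which is the decentralized property the abstract emphasizes.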
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Game Theory and Cooperation · Experimental Behavioral Economics Studies
