Counterfactual Multi-Agent Reinforcement Learning with Graph Convolution Communication
Jianyu Su, Stephen Adams, and Peter A. Beling

TL;DR
This paper introduces a multi-agent reinforcement learning architecture that combines graph-convolution-based communication with counterfactual (COMA) credit assignment, aiming to improve cooperation and make the learned communication interpretable in both static and dynamic multi-agent systems.
Contribution
The paper develops a flexible architecture that integrates graph-convolution communication with counterfactual credit assignment, improving multi-agent learning performance and interpretability.
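To make the communication half of this contribution concrete, here is a minimal sketch of one round of graph-convolution message passing among agents. It is an illustration under stated assumptions, not the authors' implementation: the function name `graph_conv_communicate`, the mean (row-normalized) aggregation with self-loops, and the ReLU nonlinearity are all choices made for this example, and the summary above does not say which normalization or activation the paper uses.

```python
import numpy as np

def graph_conv_communicate(features, adjacency, weight):
    """One round of graph-convolution message passing (hypothetical helper).

    features:  (n_agents, d_in) per-agent encodings of local observations
    adjacency: (n_agents, n_agents) 0/1 matrix; adjacency[i, j] = 1 means
               agent i receives a message from agent j
    weight:    (d_in, d_out) shared linear map (learned in a real system)
    """
    n_agents = adjacency.shape[0]
    a_hat = adjacency + np.eye(n_agents)        # self-loops: each agent keeps its own features
    degree = a_hat.sum(axis=1, keepdims=True)   # row sums, always >= 1 thanks to self-loops
    mixed = (a_hat / degree) @ features         # assumption: mean over self and neighbours
    return np.maximum(mixed @ weight, 0.0)      # assumption: ReLU nonlinearity

# Three agents on a line: 0 <-> 1 <-> 2
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
features = np.eye(3)   # one-hot features make the mixing easy to read off
weight = np.eye(3)     # identity weights, so the output is just the averaged features
messages = graph_conv_communicate(features, adjacency, weight)
```

Because the adjacency matrix is an input rather than a fixed part of the model, the same function works whether the communication graph stays constant or changes from step to step.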
Findings
Outperforms state-of-the-art methods including COMA.
Enables application to dynamic and static multi-agent systems.
Provides interpretable communication strategies.
Abstract
We consider a fully cooperative multi-agent system in which agents cooperate to maximize the system's utility in a partially observable environment. We propose that multi-agent systems must be able to (1) communicate and understand the interplay between agents and (2) correctly distribute rewards according to each agent's contribution. In contrast, most work in this setting considers only one of these abilities. In this study, we develop an architecture that allows agents to communicate and tailors the system's reward to each individual agent. Our architecture represents agent communication through graph convolution and applies an existing credit assignment structure, counterfactual multi-agent policy gradient (COMA), to help agents learn to communicate through back-propagation. The flexibility of the graph structure enables our method to be applicable to a variety…
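The credit assignment structure named in the abstract, COMA, gives each agent an advantage that compares the value of the action it took against a counterfactual baseline: the expected value over that agent's own alternative actions, with the other agents' actions held fixed. The sketch below shows that computation for a single agent. The function name `counterfactual_advantage` and the example numbers are illustrative assumptions; in a real system the Q-values would come from a learned centralized critic.

```python
import numpy as np

def counterfactual_advantage(q_values, policy, action):
    """COMA-style counterfactual advantage for one agent (illustrative helper).

    q_values: (n_actions,) critic estimates of the joint-action value for each
              of this agent's possible actions, other agents' actions held fixed
    policy:   (n_actions,) this agent's action probabilities
    action:   index of the action the agent actually took
    """
    baseline = float(np.dot(policy, q_values))   # expected value under the agent's own policy
    return float(q_values[action]) - baseline    # how much better the chosen action was

# Made-up numbers for illustration only
q_values = np.array([1.0, 3.0, 2.0])
policy = np.array([0.5, 0.25, 0.25])
advantage = counterfactual_advantage(q_values, policy, action=1)   # 3.0 - 1.75 = 1.25
```

A positive advantage means the chosen action did better than the agent's policy would have done on average in the same joint situation, which is the signal used to credit that agent individually.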
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Graph Neural Networks · Smart Grid Energy Management
Methods: Interpretability · Convolution
