Concept Learning for Cooperative Multi-Agent Reinforcement Learning
Zhonghan Ge, Yuanyang Zhu, Chunlin Chen

TL;DR
This paper introduces CMQ, a novel interpretable value decomposition method for multi-agent reinforcement learning that enhances transparency and performance by learning human-like cooperation concepts.
Contribution
It proposes a new concept learning framework for MARL that improves interpretability without sacrificing performance, using supervised cooperation concepts in value-based learning.
Findings
CMQ outperforms state-of-the-art methods on StarCraft II and Level-Based Foraging (LBF) tasks.
CMQ provides meaningful cooperation mode representations.
CMQ supports test-time concept interventions for bias detection.
Abstract
Despite substantial progress in applying neural networks (NN) to multi-agent reinforcement learning (MARL), these methods still largely suffer from a lack of transparency and interpretability, and their implicit cooperative mechanisms are not yet fully understood because the networks are black boxes. In this work, we study an interpretable value decomposition framework via concept bottleneck models, which promote trustworthiness by conditioning credit assignment on an intermediate level of human-like cooperation concepts. We propose a novel value-based method, named Concepts learning for Multi-agent Q-learning (CMQ), that goes beyond the current performance-vs-interpretability trade-off by learning interpretable cooperation concepts. CMQ represents each cooperation concept as a supervised vector, as opposed to existing models where the information flowing through their…
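To make the idea of conditioning credit assignment on cooperation concepts more concrete, here is a minimal sketch of a concept bottleneck placed between the global state and the joint value. It is not the authors' CMQ architecture. Every name, size, and parameter below (`W_concept`, `W_mix`, `q_total`, the linear concept probes, the non-negative mixing weights borrowed from monotonic value decomposition) is an assumption chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
N_AGENTS, STATE_DIM, N_CONCEPTS = 3, 4, 2

# Assumed parameters (random here; in practice they would be learned):
# W_concept maps the global state to one activation per cooperation concept.
W_concept = rng.normal(size=(STATE_DIM, N_CONCEPTS))
# W_mix gives each concept a non-negative credit weight for every agent.
W_mix = np.abs(rng.normal(size=(N_CONCEPTS, N_AGENTS)))


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def concept_activations(state):
    """Bottleneck layer: each entry is a concept activation in (0, 1)."""
    return sigmoid(state @ W_concept)


def q_total(agent_qs, state, intervene=None):
    """Mix per-agent Q-values through the concept bottleneck.

    intervene: optional dict {concept_index: value} that overrides the
    predicted activations at test time.
    """
    c = concept_activations(state)
    if intervene:
        c = c.copy()
        for idx, value in intervene.items():
            c[idx] = value
    # Credit assigned to each agent depends on the state only through c.
    agent_weights = c @ W_mix
    return float(agent_weights @ agent_qs), c


state = rng.normal(size=STATE_DIM)
agent_qs = np.array([1.0, 0.5, -0.2])

q_pred, c_pred = q_total(agent_qs, state)
# Test-time intervention: force concept 0 fully on and compare the result.
q_forced, c_forced = q_total(agent_qs, state, intervene={0: 1.0})
print("predicted concepts:", c_pred, "Q_tot:", q_pred)
print("intervened concepts:", c_forced, "Q_tot:", q_forced)
```

Because the joint value depends on the state only through the concept activations, overriding one activation and comparing the two `Q_tot` values shows how much that concept drives the credit given to each agent. This is the kind of inspection that a test-time concept intervention enables.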
