Credit-cognisant reinforcement learning for multi-agent cooperation
F. Bredell, H. A. Engelbrecht, J. C. Schoeman

TL;DR
This paper introduces credit-cognisant rewards (CCRs) in multi-agent reinforcement learning to improve cooperation and performance in partially observable environments, demonstrated through a card game benchmark.
Contribution
The paper proposes CCRs to enhance credit assignment in MARL, outperforming traditional independent and recurrent Q-learning methods.
Findings
CCRs improve performance over independent and recurrent Q-learning baselines.
Enhanced credit assignment leads to better cooperation.
Results are demonstrated on a simplified version of the card game Hanabi.
Abstract
Traditional multi-agent reinforcement learning (MARL) algorithms, such as independent Q-learning, struggle when presented with partially observable scenarios, and where agents are required to develop delicate action sequences. This is often the result of the reward for a good action only being available after other agents have taken theirs, and these actions are not credited accordingly. Recurrent neural networks have proven to be a viable solution strategy for solving these types of problems, resulting in significant performance increase when compared to other methods. In this paper, we explore a different approach and focus on the experiences used to update the action-value functions of each agent. We introduce the concept of credit-cognisant rewards (CCRs), which allows an agent to perceive the effect its actions had on the environment as well as on its co-agents. We show that by…
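The credit-assignment idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' formulation: it assumes a turn-based game in which agents act in a fixed round-robin order, and it assumes that an agent's credited reward is its own immediate reward plus the rewards earned by its co-agents on their following turns. The function name `credit_cognisant_rewards` and its parameters are hypothetical.

```python
from typing import List


def credit_cognisant_rewards(step_rewards: List[float], n_agents: int) -> List[float]:
    """Return one credited reward per step (illustrative assumption only).

    Step t is assumed to be taken by agent t % n_agents. The credited reward
    for step t sums the reward at step t and the rewards of the next
    n_agents - 1 steps, i.e. the co-agents' turns before the same agent
    acts again. Near the end of the episode the window is simply shorter.
    """
    credited = []
    for t in range(len(step_rewards)):
        window = step_rewards[t : t + n_agents]
        credited.append(sum(window))
    return credited


# Two agents: agent 0 gives a hint (reward 0), agent 1 then scores (reward 1).
raw = [0.0, 1.0, 0.0, 0.0]
credited = credit_cognisant_rewards(raw, n_agents=2)
```

In this toy episode the hint at step 0 receives a credited reward of 1.0 instead of 0.0, so a Q-learning update on that experience would see the benefit of an action whose payoff only arrived after the co-agent moved.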
Taxonomy
Topics: Sports Analytics and Performance · Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications
Methods: Test · Q-Learning
