TL;DR
This paper introduces a novel method to enhance multi-agent cooperation in Hanabi by augmenting the action space with conventions, inspired by human strategies, leading to improved performance in self-play and cross-play scenarios.
Contribution
It proposes a new approach that incorporates human-like conventions into agent action spaces, improving cooperation efficiency in Hanabi without complex architectures.
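As a rough illustration of what "augmenting the action space with conventions" could look like, the sketch below adds convention actions to a list of primitive Hanabi-style moves and resolves a chosen convention back to a primitive move. It is a minimal sketch under stated assumptions, not the paper's implementation: the toy state fields (`hinted_slot`, `hint_tokens`), the two example conventions, and every function name are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical toy state: only the fields the example conventions inspect.
State = Dict[str, int]

HAND_SIZE = 5  # assumption: five-card hands

# Primitive moves, named by hand slot (hint targets omitted for brevity).
PRIMITIVE_ACTIONS: List[str] = (
    [f"play_{i}" for i in range(HAND_SIZE)]
    + [f"discard_{i}" for i in range(HAND_SIZE)]
    + ["hint_color", "hint_rank"]
)


@dataclass(frozen=True)
class Convention:
    """A mutually agreed rule exposed to the agent as one extra action."""
    name: str
    is_available: Callable[[State], bool]  # may the convention be used now?
    to_primitive: Callable[[State], str]   # primitive move it resolves to


# Illustrative conventions only; these are NOT taken from the paper.
CONVENTIONS: List[Convention] = [
    Convention(
        name="play_hinted_card",
        is_available=lambda s: s["hinted_slot"] >= 0,
        to_primitive=lambda s: f"play_{s['hinted_slot']}",
    ),
    Convention(
        name="discard_oldest_when_out_of_hints",
        is_available=lambda s: s["hint_tokens"] == 0,
        to_primitive=lambda s: "discard_0",
    ),
]


def augmented_actions(state: State) -> List[str]:
    """Primitive actions plus every convention currently available."""
    extra = [f"conv:{c.name}" for c in CONVENTIONS if c.is_available(state)]
    return PRIMITIVE_ACTIONS + extra


def resolve(action: str, state: State) -> str:
    """Map a chosen action to the primitive move sent to the environment."""
    if action.startswith("conv:"):
        name = action[len("conv:"):]
        conv = next(c for c in CONVENTIONS if c.name == name)
        return conv.to_primitive(state)
    return action


if __name__ == "__main__":
    state = {"hinted_slot": 2, "hint_tokens": 3}
    for a in augmented_actions(state):
        print(a, "->", resolve(a, state))
```

In this framing the learning algorithm is unchanged: the agent still picks one discrete action per turn, but some of those actions carry an agreed-upon meaning that partners can rely on.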
Findings
Significant performance improvements in Hanabi self-play scenarios.
Enhanced cross-play capabilities among different agent strategies.
Reduction in training complexity compared to previous methods.
Abstract
The card game Hanabi is considered a strong medium for testing and developing multi-agent reinforcement learning (MARL) algorithms, due to its cooperative nature, partial observability, limited communication, and remarkable complexity. Previous research has explored the capabilities of MARL algorithms within Hanabi, focusing largely on advanced architecture design and algorithmic manipulations to achieve state-of-the-art performance for various numbers of cooperators. However, this often leads to complex solution strategies that carry a high computational cost and require large amounts of training data. To play Hanabi effectively, humans rely on conventions, which allow them to implicitly convey ideas or knowledge based on a predefined, mutually agreed-upon set of "rules" or principles. Multi-agent problems containing partial…
Taxonomy
Methods: Sparse Evolutionary Training · OPT
