Reinforcement Learning on Human Decision Models for Uniquely Collaborative AI Teammates

Nicholas Kantack

arXiv:2111.09800·cs.AI·November 19, 2021

TL;DR

This paper describes an AI agent for the collaborative card game Hanabi that was trained with a behavioral clone of a human player (the author). The agent achieved a human-play average score of 16.5 with players it had never encountered, and it discovered a play style that complements human strategy.

Contribution

The study models the author's decision making in Hanabi, builds a behavioral clone from that model, and trains an agent with the clone as its partner. The resulting human-compatible agent is reported to outperform the state of the art for human-bot Hanabi scores.

Findings

01. Achieved a human-play average score of 16.5 in Hanabi.

02. Discovered human-complementary play styles through modeling and exploration.

03. Demonstrated the effectiveness of behavioral cloning for collaborative AI agents.

Abstract

In 2021 the Johns Hopkins University Applied Physics Laboratory held an internal challenge to develop artificially intelligent (AI) agents that could excel at the collaborative card game Hanabi. Agents were evaluated on their ability to play with human players whom the agents had never previously encountered. This study details the development of the agent that won the challenge by achieving a human-play average score of 16.5, outperforming the current state-of-the-art for human-bot Hanabi scores. The winning agent's development consisted of observing and accurately modeling the author's decision making in Hanabi, then training with a behavioral clone of the author. Notably, the agent discovered a human-complementary play style by first mimicking human decision making, then exploring variations to the human-like strategy that led to higher simulated human-bot scores. This work examines…
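The two-stage idea in the abstract (mimic a human model first, then explore variations that raise the simulated human-bot score) can be illustrated with a toy sketch. This is not the author's implementation and it does not play Hanabi: the tabular policy, the `payoff` function, the demonstration data, and the helper names `fit_clone`, `team_score`, and `improve` are all hypothetical, chosen only to keep the example small and runnable.

```python
from collections import Counter, defaultdict

def fit_clone(demos):
    """Toy behavioral clone: the most frequent human action in each state."""
    counts = defaultdict(Counter)
    for state, action in demos:
        counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}

def team_score(agent, partner, payoff):
    """Total payoff over all states when `agent` plays alongside `partner`."""
    return sum(payoff(s, agent[s], partner[s]) for s in partner)

def improve(clone, actions, payoff):
    """Start from the clone's own policy, then keep any single-state change
    that raises the simulated score with the clone as partner."""
    agent = dict(clone)
    for s in clone:
        for a in actions:
            trial = {**agent, s: a}
            if team_score(trial, clone, payoff) > team_score(agent, clone, payoff):
                agent = trial
    return agent

# Hypothetical cooperative payoff: the team scores in state s only when
# the two chosen actions sum to s modulo 3.
def payoff(state, agent_action, partner_action):
    return 1 if (agent_action + partner_action) % 3 == state else 0

# Hypothetical (state, action) demonstrations from a human player.
demos = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (2, 2), (2, 2), (2, 0)]

clone = fit_clone(demos)
mimic_score = team_score(clone, clone, payoff)      # agent that only imitates
agent = improve(clone, actions=[0, 1, 2], payoff=payoff)
complement_score = team_score(agent, clone, payoff)  # agent after exploration
```

In this toy setting, an agent that merely copies the clone scores lower than one that adjusts its actions around the clone's habits, which is the sense of "human-complementary" illustrated here. A real Hanabi agent would need a far richer state representation and a learned policy rather than a lookup table.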

Code & Models

Repositories

nickkantack/Hanabi-Cyclone (Official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Artificial Intelligence in Games