Ad-Hoc Human-AI Coordination Challenge

Tin Dizdarević; Ravi Hammond; Tobias Gessler; Anisoara Calinescu; Jonathan Cook; Matteo Gallici; Andrei Lupu; Darius Muglich; Johannes Forkel; Jakob Nicolaus Foerster

arXiv:2506.21490 · cs.AI · July 1, 2025

TL;DR

This paper introduces AH2AC2, a challenge for human-AI coordination in Hanabi. It develops human proxy agents for scalable evaluation and provides baseline results, an open dataset, and a controlled evaluation system.

Contribution

It presents a benchmark for human-AI coordination in Hanabi, with human proxy agents trained on a large-scale human dataset and an open dataset of 3,079 games to support research and reproducibility.

Findings

01

Developed human proxy agents for Hanabi coordination evaluation.

02

Open-sourced a dataset of 3,079 games for research.

03

Provided baseline results for two- and three-player scenarios.

Abstract

Achieving seamless coordination between AI agents and humans is crucial for real-world applications, yet it remains a significant open challenge. Hanabi is a cooperative card game featuring imperfect information, constrained communication, theory of mind requirements, and coordinated action, making it an ideal testbed for human-AI coordination. However, its use for human-AI interaction has been limited by the challenges of human evaluation. In this work, we introduce the Ad-Hoc Human-AI Coordination Challenge (AH2AC2) to overcome the constraints of costly and difficult-to-reproduce human evaluations. We develop human proxy agents on a large-scale human dataset that serve as robust, cheap, and reproducible human-like evaluation partners in AH2AC2. To encourage the development of data-efficient methods, we open-source a dataset of 3,079 games, deliberately limiting the amount…

Code & Models

Repositories

flairox/ah2ac2 (JAX, official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Games · Multimodal Machine Learning Applications