Towards Game-Playing AI Benchmarks via Performance Reporting Standards

Vanessa Volz; Boris Naujoks

arXiv:2007.02742·cs.AI·July 7, 2020

TL;DR

This paper proposes standardized reporting guidelines for AI game-playing performance to enable unbiased comparisons and facilitate the development of benchmarks and competitions.

Contribution

The paper introduces a framework of performance reporting standards for game-playing AI research, addressing the lack of comparability across studies.

Findings

01 The guidelines improve the clarity and comparability of AI performance reports.

02 They facilitate the creation of benchmarks and competitions.

03 They support more general conclusions about the strengths and weaknesses of game-playing AI algorithms and the challenges different games pose.

Abstract

While games have been used extensively as milestones to evaluate game-playing AI, there exists no standardised framework for reporting the obtained observations. As a result, it remains difficult to draw general conclusions about the strengths and weaknesses of different game-playing AI algorithms. In this paper, we propose reporting guidelines for AI game-playing performance that, if followed, provide information suitable for unbiased comparisons between different AI approaches. The vision we describe is to build benchmarks and competitions based on such guidelines in order to be able to draw more general conclusions about the behaviour of different AI algorithms, as well as the types of challenges different games pose.
