Game Reasoning Arena: A Framework and Benchmark for Assessing Reasoning Capabilities of Large Language Models via Game Play

Lucia Cipolina-Kun; Marianna Nezhurina; Jenia Jitsev

arXiv:2508.03368·cs.AI·August 19, 2025

Open Access

TL;DR

The paper introduces Game Reasoning Arena, a comprehensive framework for evaluating large language models' reasoning abilities through strategic game play, enabling systematic comparisons across various agent types and game scenarios.

Contribution

It presents a new framework and benchmark for assessing LLM reasoning via game play, integrating multiple agent types and supporting scalable, distributed evaluation.

Findings

1. Enables systematic comparison of LLMs and other agents in strategic games.

2. Supports diverse game scenarios and agent types for comprehensive evaluation.

3. Facilitates empirical analysis of LLM reasoning and game-theoretic behavior.

Abstract

The Game Reasoning Arena library provides a framework for evaluating the decision-making abilities of large language models (LLMs) through strategic board games implemented in Google's OpenSpiel library. The framework wraps multiple board and matrix games and supports different agent types, enabling systematic comparisons between LLM-based agents and other agents (random, heuristic, reinforcement learning, etc.) across game scenarios. It integrates API access to models via LiteLLM, supports local model deployment via vLLM, and offers distributed execution through Ray. This paper summarises the library's structure, key characteristics, and motivation, highlighting how it contributes to the empirical evaluation of LLM reasoning and game-theoretic behaviour.
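To make the idea of comparing agent types on a matrix game concrete, here is a minimal sketch in plain Python. It does not use Game Reasoning Arena, OpenSpiel, LiteLLM, vLLM, or Ray, and it is not the library's API: the names `PAYOFFS`, `Agent`, `make_random_agent`, `tit_for_tat`, `always_defect`, and `play_match` are hypothetical and chosen only for illustration. The game is an iterated Prisoner's Dilemma with conventional textbook payoffs, which is an assumption and not a setting taken from the paper.

```python
import random
from typing import Callable, Dict, List, Tuple

# Assumed payoff table for a Prisoner's Dilemma (0 = cooperate, 1 = defect).
# Keys are (action_a, action_b); values are (payoff_a, payoff_b).
PAYOFFS: Dict[Tuple[int, int], Tuple[int, int]] = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}

# Hypothetical agent interface: given the opponent's past actions,
# return the next action (0 or 1).
Agent = Callable[[List[int]], int]


def make_random_agent(seed: int) -> Agent:
    """Random baseline with its own seeded generator for reproducibility."""
    rng = random.Random(seed)
    return lambda opponent_history: rng.choice([0, 1])


def tit_for_tat(opponent_history: List[int]) -> int:
    """Heuristic baseline: cooperate first, then copy the opponent's last move."""
    return opponent_history[-1] if opponent_history else 0


def always_defect(opponent_history: List[int]) -> int:
    """Heuristic baseline: defect on every round."""
    return 1


def play_match(agent_a: Agent, agent_b: Agent, rounds: int = 10) -> Tuple[int, int]:
    """Play a fixed number of rounds and return the total payoff of each agent."""
    history_a: List[int] = []
    history_b: List[int] = []
    total_a = total_b = 0
    for _ in range(rounds):
        # Both agents act simultaneously, seeing only the opponent's past moves.
        action_a = agent_a(history_b)
        action_b = agent_b(history_a)
        payoff_a, payoff_b = PAYOFFS[(action_a, action_b)]
        total_a += payoff_a
        total_b += payoff_b
        history_a.append(action_a)
        history_b.append(action_b)
    return total_a, total_b


if __name__ == "__main__":
    agents = {
        "random": make_random_agent(seed=0),
        "tit_for_tat": tit_for_tat,
        "always_defect": always_defect,
    }
    # Round-robin comparison across agent types.
    for name_a, agent_a in agents.items():
        for name_b, agent_b in agents.items():
            print(name_a, "vs", name_b, play_match(agent_a, agent_b))
```

In this sketch an LLM-based agent would simply be another callable with the same signature, for example one that formats the history into a prompt and parses the model's reply into an action. How the actual framework structures its agents and game wrappers is described in the paper and repository, not here.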

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling