LUDOBENCH: Evaluating LLM Behavioural Decision-Making Through Spot-Based Board Game Scenarios in Ludo

Ojas Jain; Dhruv Kumar

arXiv:2604.05681·cs.AI·April 8, 2026

TL;DR

LudoBench is a benchmark for assessing large language models' strategic reasoning in the complex, stochastic game of Ludo, using handcrafted scenarios and a functional simulator to analyze model behaviors and vulnerabilities.

Contribution

Introduces LudoBench, a benchmark of 480 handcrafted spot scenarios across 12 decision categories, together with a 4-player Ludo simulator, for evaluating LLM strategic reasoning. The evaluation surfaces distinct behavioral archetypes and sensitivity to prompt framing.

Findings

01. All evaluated models agree with the game-theory baseline only 40-46% of the time.

02. Models fall into two archetypes, finishers and builders, each capturing only half of the strategy.

03. Behavior shifts under history-conditioned grudge framing, indicating prompt sensitivity.

Abstract

We introduce LudoBench, a benchmark for evaluating LLM strategic reasoning in Ludo, a stochastic multi-agent board game whose dice mechanics, piece capture, safe-square navigation, and home-path progression introduce meaningful planning complexity. LudoBench comprises 480 handcrafted spot scenarios across 12 behaviorally distinct decision categories, each isolating a specific strategic choice. We additionally contribute a fully functional 4-player Ludo simulator supporting Random, Heuristic, Game-Theory, and LLM agents. The game-theory agent uses Expectiminimax search with depth-limited lookahead to provide a principled strategic ceiling beyond greedy heuristics. Evaluating six models spanning four model families, we find that all models agree with the game-theory baseline only 40-46% of the time. Models split into distinct behavioral archetypes: finishers that complete pieces but…
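The abstract names Expectiminimax with depth-limited lookahead as the game-theory baseline. As a rough illustration of that general technique (not the authors' implementation), here is a minimal sketch over a hand-built toy tree. The `Node` class, the `best_move` helper, the payoffs, and the probabilities are all hypothetical, and the sketch assumes a two-sided max/min/chance formulation; how the paper handles four players is not stated in this summary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical game-tree node used only for this illustration."""
    kind: str                      # "max", "min", "chance", or "leaf"
    value: float = 0.0             # leaf payoff, or heuristic estimate if search stops here
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)  # outcome probabilities (chance nodes)
    label: str = ""                # move name, for readability


def expectiminimax(node: Node, depth: int) -> float:
    """Depth-limited expectiminimax: falls back to node.value at the cutoff."""
    if node.kind == "leaf" or depth == 0:
        return node.value
    vals = [expectiminimax(c, depth - 1) for c in node.children]
    if node.kind == "max":
        return max(vals)
    if node.kind == "min":
        return min(vals)
    if node.kind == "chance":
        # Expected value over the random outcomes (e.g. a dice roll).
        return sum(p * v for p, v in zip(node.probs, vals))
    raise ValueError(f"unknown node kind: {node.kind}")


def best_move(root: Node, depth: int) -> str:
    """Return the label of the root child with the highest searched value."""
    scored = [(expectiminimax(c, depth - 1), c.label) for c in root.children]
    return max(scored)[1]


# Toy scenario with made-up numbers: capturing looks attractive to the
# heuristic (4.0), but after the opponent's roll there is a 25% chance of
# being captured back (-4.0) and a 75% chance of keeping the gain (5.0).
capture = Node("chance", value=4.0, label="capture", probs=[0.25, 0.75],
               children=[Node("leaf", -4.0), Node("leaf", 5.0)])
safe = Node("leaf", 3.0, label="safe")
root = Node("max", children=[capture, safe])

print(best_move(root, depth=1))  # capture (heuristic only: 4.0 vs 3.0)
print(best_move(root, depth=2))  # safe (expected 2.75 vs 3.0)
```

In this toy tree, a one-ply search trusts the heuristic and picks the capture, while one more ply of lookahead averages over the opponent's roll and prefers the safe square. This is the kind of gap between greedy heuristics and search that a lookahead baseline is meant to expose.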

Code & Models

Repositories

https://anonymous.4open.science/r/LudoBench-5CBF