Yahtzee: Reinforcement Learning Techniques for Stochastic Combinatorial Games

Nicholas A. Pape

arXiv:2601.00007·cs.LG·January 5, 2026

Open Access

TL;DR

This paper formulates Yahtzee as an MDP and evaluates three policy gradient algorithms on it. A2C trains robustly and comes close to optimal performance.

Contribution

The study applies and compares three policy gradient algorithms (REINFORCE, A2C, and PPO) on Yahtzee, highlights A2C's robustness, and analyzes why optimal strategies are hard to learn.

Findings

01 A2C trains robustly across different settings.

02 The agent's score comes within 5% of optimal.

03 Models struggle to learn the upper bonus strategy.
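To make the third finding concrete, here is a minimal sketch of the upper-section bonus under the standard Yahtzee rules (35 extra points once the six upper categories total at least 63). The rule values come from the commonly published rules, not from this page, and the function names are hypothetical. The point it illustrates is the delayed reward: the bonus depends on six separate scoring decisions and only pays out when the threshold is reached.

```python
# Standard Yahtzee upper-section rule (an assumption; not stated on this page).
UPPER_BONUS_THRESHOLD = 63
UPPER_BONUS = 35


def upper_category_score(dice, face):
    """Score a roll in an upper category: the sum of dice showing `face`."""
    return face * dice.count(face)


def upper_section_total(upper_scores):
    """Total of the six upper categories, plus the bonus if the threshold is met."""
    subtotal = sum(upper_scores.values())
    bonus = UPPER_BONUS if subtotal >= UPPER_BONUS_THRESHOLD else 0
    return subtotal + bonus
```

Scoring three of each face gives exactly 63 and earns the bonus; falling one die short in a single category forfeits all 35 points, which is one plausible reason the strategy is hard to learn from sparse feedback.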

Abstract

Yahtzee is a classic dice game with a stochastic, combinatorial structure and delayed rewards, making it an interesting mid-scale RL benchmark. While an optimal policy for solitaire Yahtzee can be computed using dynamic programming methods, multiplayer is intractable, motivating approximation methods. We formulate Yahtzee as a Markov Decision Process (MDP), and train self-play agents using various policy gradient methods: REINFORCE, Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO), all using a multi-headed network with a shared trunk. We ablate feature and action encodings, architecture, return estimators, and entropy regularization to understand their impact on learning. Under a fixed training budget, REINFORCE and PPO prove sensitive to hyperparameters and fail to reach near-optimal performance, whereas A2C trains robustly across a range of settings. Our agent…
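The pieces named in the abstract can be sketched in a few lines. The following is an illustrative sketch, not the paper's implementation: it enumerates one possible action space for solitaire Yahtzee (a keep mask over five dice, or a choice of open scoring category) and computes the Monte Carlo advantage that an actor-critic method such as A2C could use. The action encoding, the function names, and the choice of return estimator are all assumptions; the paper ablates several encodings and estimators that are not described on this page.

```python
from itertools import product

NUM_DICE = 5
NUM_CATEGORIES = 13  # standard Yahtzee scorecard size

# Every way to hold (1) or reroll (0) each of five dice: 2**5 = 32 masks.
KEEP_MASKS = list(product((0, 1), repeat=NUM_DICE))


def legal_actions(rolls_left, open_categories):
    """List hypothetical (kind, payload) actions available in one state."""
    actions = [("score", c) for c in sorted(open_categories)]
    if rolls_left > 0:
        # Only masks that reroll at least one die count as a reroll action.
        actions += [("reroll", m) for m in KEEP_MASKS if 0 in m]
    return actions


def discounted_returns(rewards, gamma=1.0):
    """Monte Carlo return G_t = r_t + gamma * G_{t+1}, computed backwards."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def advantages(rewards, values, gamma=1.0):
    """Advantage A_t = G_t - V(s_t), using the critic's value estimates."""
    returns = discounted_returns(rewards, gamma)
    return [g - v for g, v in zip(returns, values)]
```

For example, `advantages([0, 0, 25], [20.0, 22.0, 24.0])` gives a positive advantage at every step, because the realized return of 25 exceeds each value estimate. In a policy gradient update, those advantages would weight the log-probabilities of the actions taken.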

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Educational Games and Gamification