Impartial Games: A Challenge for Reinforcement Learning

Bei Zhou; Søren Riis

arXiv:2205.12787 · cs.LG · January 22, 2026 · 1 citation

TL;DR

This paper demonstrates that AlphaZero-style reinforcement learning algorithms face fundamental challenges in mastering impartial games like Nim, due to their inability to learn abstract mathematical principles such as parity, especially as game complexity increases.

Contribution

The paper introduces a new framework to evaluate RL agents in impartial games and reveals inherent representational limitations of neural networks in learning abstract functions like parity.

Findings

01. AlphaZero-style agents succeed on small Nim boards but struggle as size increases.

02. Neural networks have difficulty learning non-associative functions like parity.

03. Current RL algorithms cannot effectively master impartial games beyond rote memorization.
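
To make the parity finding concrete, here is a minimal Python sketch of the parity function and one property often cited as a reason it is hard to learn from examples: flipping any single input bit always flips the output, so no individual input is informative on its own. The helper names `parity` and `flips_label_everywhere` are illustrative assumptions, and this is an intuition aid rather than the paper's own analysis.

```python
from itertools import product


def parity(bits):
    """Return 1 if the number of 1-bits is odd, else 0."""
    return sum(bits) % 2


def flips_label_everywhere(n):
    """Check that flipping any single bit changes the parity of every n-bit input.

    Hypothetical helper for illustration; it enumerates all 2**n inputs,
    so it is only practical for small n.
    """
    for bits in product((0, 1), repeat=n):
        for i in range(n):
            flipped = bits[:i] + (1 - bits[i],) + bits[i + 1:]
            if parity(flipped) == parity(bits):
                return False
    return True


if __name__ == "__main__":
    print(parity((1, 0, 1)))          # two 1-bits, so even parity
    print(flips_label_everywhere(4))  # holds for every 4-bit input
```

Because the label depends on all bits jointly, a learner that has only seen some inputs has little to generalize from, which is consistent with the "rote memorization" finding above.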

Abstract

AlphaZero-style reinforcement learning (RL) algorithms have achieved superhuman performance in many complex board games such as Chess, Shogi, and Go. However, we showcase that these algorithms encounter significant and fundamental challenges when applied to impartial games, a class where players share game pieces and optimal strategy often relies on abstract mathematical principles. Specifically, we utilise the game of Nim as a concrete and illustrative case study to reveal critical limitations of AlphaZero-style and similar self-play RL algorithms. We introduce a novel conceptual framework distinguishing between champion and expert mastery to evaluate RL agent performance. Our findings reveal that while AlphaZero-style agents can achieve champion-level play on very small Nim boards, their learning progression severely degrades as the board size increases. This difficulty stems not…
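
For readers unfamiliar with the "abstract mathematical principle" behind Nim, the sketch below shows the classical nim-sum rule: under the normal-play convention (the player who takes the last object wins, an assumption here), the player to move is losing exactly when the bitwise XOR of the heap sizes is zero. The function names `nim_sum` and `winning_move` are illustrative and are not taken from the paper or its repository.

```python
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """Bitwise XOR of all heap sizes (a per-bit-column parity)."""
    return reduce(xor, heaps, 0)


def winning_move(heaps):
    """Return (heap_index, new_size) that leaves a nim-sum of 0.

    Returns None when the nim-sum is already 0, i.e. the player to move
    has no winning move under normal play (assumed convention).
    """
    total = nim_sum(heaps)
    if total == 0:
        return None
    for i, h in enumerate(heaps):
        target = h ^ total
        # A legal move must strictly shrink the chosen heap.
        if target < h:
            return i, target
    return None


if __name__ == "__main__":
    print(winning_move([3, 4, 5]))  # (0, 1): reduce heap 0 from 3 to 1
    print(winning_move([1, 2, 3]))  # None: nim-sum is already 0
```

The rule is a parity computation over each binary digit of the heap sizes, which is why an agent that cannot represent parity would have to memorize positions instead.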

Code & Models

Repositories

sagebei/impartial-games-a-chanllenge-for-reinforcement-learning
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Sports Analytics and Performance · Explainable Artificial Intelligence (XAI)

Methods: Residual Connection · Batch Normalization · Residual Block · Prioritized Experience Replay · Convolution · Average Pooling · Monte-Carlo Tree Search · MuZero · AlphaZero