Mastering NIM and Impartial Games with Weak Neural Networks: An AlphaZero-inspired Multi-Frame Approach
Søren Riis

TL;DR
This paper investigates how neural network architectures inspired by AlphaZero can master impartial games like NIM under fixed-latency, quantized inference constraints, highlighting the importance of structural priors for effective learning.
Contribution
It introduces multi-policy-head and multi-frame architectures that overcome representational barriers in the FSQI/AC0 regime, enabling near-perfect game mastery.
Findings
Single-head models perform near chance levels.
Two-frame models achieve near-perfect accuracy.
Multi-head FSM models reach perfect classification.
Abstract
We study impartial games under fixed-latency, fixed-scale quantised inference (FSQI). In this fixed-scale, bounded-range regime, we prove that inference is simulable by constant-depth polynomial-size Boolean circuits (AC0). This yields a worst-case representational barrier: single-frame agents in the FSQI/AC0 regime cannot strongly master NIM, because optimal play depends on the global nim-sum (parity). Under our stylised deterministic rollout interface, a single rollout policy head from the structured family analysed here reveals only one fixed linear functional of the invariant, so increasing rollout budget alone does not recover the missing bits. We derive two structural bypasses: (1) a multi-policy-head rollout architecture that recovers the full invariant via distinct rollout channels, and (2) a multi-frame architecture that tracks local nimber differences and supports restoration.…
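The dependence of optimal play on the global nim-sum can be made concrete with a short sketch of the standard textbook rule for normal-play NIM (the player who takes the last object wins). This is not the paper's architecture; the function names `nim_sum` and `winning_move` are illustrative assumptions. A position is losing for the player to move exactly when the bitwise XOR of all heap sizes is zero; otherwise some heap can be reduced so that the XOR becomes zero.

```python
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """Bitwise XOR of all heap sizes (the global invariant)."""
    return reduce(xor, heaps, 0)


def winning_move(heaps):
    """Return (heap_index, new_size) for a winning move under normal play,
    or None if the position is already lost (nim-sum is zero)."""
    s = nim_sum(heaps)
    if s == 0:
        return None
    for i, h in enumerate(heaps):
        target = h ^ s  # size that would make the overall nim-sum zero
        if target < h:  # legal only if it removes at least one object
            return i, target
    return None  # not reached when s != 0


if __name__ == "__main__":
    print(nim_sum([3, 4, 5]))
    print(winning_move([3, 4, 5]))
    print(winning_move([1, 2, 3]))
```

Every bit of the nim-sum is a parity over all heaps, so no heap can be ignored when choosing a move. This is the global dependence that the abstract identifies as the obstacle for single-frame agents in the FSQI/AC0 regime.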
Taxonomy
Topics: Artificial Intelligence in Games
Methods: AlphaZero
