AlphaSnake: Policy Iteration on a Nondeterministic NP-hard Markov Decision Process

Kevin Du; Ian Gemp; Yi Wu; Yingying Wu

arXiv:2211.09622 · cs.AI · November 18, 2022

TL;DR

This paper introduces AlphaSnake, an AlphaZero-inspired algorithm that uses Monte Carlo Tree Search to learn policies for the NP-hard game of Snake, and reports a win rate above 0.5.

Contribution

The paper applies policy iteration via MCTS to the game of Snake, an NP-hard problem modeled as a stochastic MDP.
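As a rough illustration of the tree-search step in AlphaZero-style methods, the sketch below scores each action at a search node with the PUCT rule published for AlphaZero: the mean value of the action plus an exploration bonus weighted by the network prior. It is not taken from the AlphaSnake paper. The function name `puct_select`, the list-based node statistics, and the constant `c_puct=1.5` are all assumptions made for this example.

```python
import math

def puct_select(priors, visit_counts, value_sums, c_puct=1.5):
    """Return the index of the action with the highest PUCT score.

    Hypothetical helper for illustration only.
    priors:       prior probability of each action (from a policy network)
    visit_counts: number of times each action was tried at this node
    value_sums:   sum of backed-up values for each action
    c_puct:       exploration constant (assumed value, not from the paper)
    """
    total_visits = sum(visit_counts)
    best_action, best_score = None, -math.inf
    for a, prior in enumerate(priors):
        n = visit_counts[a]
        # Mean action value; unvisited actions default to 0.
        q = value_sums[a] / n if n > 0 else 0.0
        # Exploration bonus: large for high-prior, rarely visited actions.
        u = c_puct * prior * math.sqrt(total_visits) / (1 + n)
        if q + u > best_score:
            best_action, best_score = a, q + u
    return best_action
```

With equal priors, an unvisited action is preferred over a heavily visited one, while among equally visited actions the one with the higher mean value is chosen.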

Findings

01. Achieved a win rate over 0.5 in Snake.

02. First to demonstrate AlphaZero's effectiveness on NP-hard environments.

03. Surpassed previous algorithms in game performance.

Abstract

Reinforcement learning has recently been used to approach well-known NP-hard combinatorial problems in graph theory. Among these problems, Hamiltonian cycle problems are exceptionally difficult to analyze, even when restricted to individual instances of structurally complex graphs. In this paper, we use Monte Carlo Tree Search (MCTS), the search algorithm behind many state-of-the-art reinforcement learning algorithms such as AlphaZero, to create autonomous agents that learn to play the game of Snake, a game centered on properties of Hamiltonian cycles on grid graphs. The game of Snake can be formulated as a single-player discounted Markov Decision Process (MDP) where the agent must behave optimally in a stochastic environment. Determining the optimal policy for Snake, defined as the policy that maximizes the probability of winning (the win rate) with higher priority and minimizes the…
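To make the MDP formulation concrete, here is a minimal sketch of Snake as a single-player environment on an n × n grid, where the only randomness is the placement of the next food item. It is not the paper's environment. The class name `SnakeMDP`, the start position, and the reward of 1 on a win and 0 otherwise are assumptions for illustration, and discounting is left to the caller.

```python
import random
from collections import deque

# Grid moves as (row delta, column delta). All names here are hypothetical.
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

class SnakeMDP:
    """Toy Snake environment: the state is (snake body, food cell)."""

    def __init__(self, n=4, seed=0):
        self.n = n
        self.rng = random.Random(seed)
        self.snake = deque([(0, 0)])  # snake[0] is the head
        self.food = self._spawn_food()
        self.done = False
        self.won = False

    def _spawn_food(self):
        # Stochastic transition: food appears uniformly on a free cell.
        free = [(r, c) for r in range(self.n) for c in range(self.n)
                if (r, c) not in self.snake]
        return self.rng.choice(free) if free else None

    def step(self, action):
        """Apply one move; return (reward, done)."""
        if self.done:
            raise RuntimeError("episode has ended")
        dr, dc = MOVES[action]
        r, c = self.snake[0]
        head = (r + dr, c + dc)
        grows = head == self.food
        # The tail cell is vacated this step unless the snake grows.
        body = set(self.snake) if grows else set(list(self.snake)[:-1])
        off_grid = not (0 <= head[0] < self.n and 0 <= head[1] < self.n)
        if off_grid or head in body:
            self.done = True
            return 0.0, True  # loss: hit a wall or the snake itself
        self.snake.appendleft(head)
        if grows:
            self.food = self._spawn_food()
            if self.food is None:  # snake fills the grid
                self.done = self.won = True
                return 1.0, True
        else:
            self.snake.pop()
        return 0.0, False
```

One way to see the connection to Hamiltonian cycles: a policy that always follows a fixed Hamiltonian cycle of the grid never collides with itself, because the tail vacates each cell before the head returns to it, so on a 2 × 2 grid the cycle right, down, left, up eventually fills the board.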

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Digital Games and Media

Methods: AlphaZero