HEX and Neurodynamic Programming

Debangshu Banerjee

arXiv:2008.06359·cs.LG·August 17, 2020

TL;DR

This paper introduces a novel approach to solving the game of Hex using reinforcement learning and neural networks, avoiding traditional game tree methods and heuristics, inspired by AlphaGo Zero's success.

Contribution

It presents a new method for Hex that bypasses game tree structures and heuristics, relying solely on reinforcement learning and neural network approximations.

Findings

01 Successful application of reinforcement learning to Hex

02 Avoidance of traditional game tree and heuristic methods

03 Neural networks approximate state-action evaluations in place of large lookup tables

Abstract

Hex is a complex game with a high branching factor. This work attempts, for the first time, to solve Hex without game tree structures and their associated pruning methods. We also abstain from any heuristic information about Virtual Connections or Semi Virtual Connections, which were used in all previously known computer versions of the game. The H-search algorithm, which was the basis for finding such connections and had been used with success in earlier Hex-playing agents, has been forgone. Instead, we use reinforcement learning through self-play and neural network approximation to bypass the problems of a high branching factor and of maintaining large tables for state-action evaluations. Our code is based primarily on NeuroHex. The inspiration is drawn from the recent success of AlphaGo Zero.
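
The idea of replacing a state-action table with a learned approximator trained by self-play can be illustrated with a small sketch. This is not the paper's implementation and not NeuroHex code. It assumes a 3x3 board, a linear approximator standing in for a neural network, and a negamax-style Q-learning target. All names (`winner`, `LinearQ`, `self_play_episode`) are hypothetical and chosen for illustration only.

```python
import random

N = 3  # assumption: tiny board for illustration; real Hex boards are larger

# Six neighbours of a cell on a rhombic Hex grid stored as (row, col).
NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, 1), (1, -1)]


def winner(board):
    """Return +1 if player +1 links top to bottom, -1 if player -1 links
    left to right, and 0 if neither has connected yet."""
    for player in (1, -1):
        if player == 1:
            stack = [(0, c) for c in range(N) if board[c] == 1]
        else:
            stack = [(r, 0) for r in range(N) if board[r * N] == -1]
        seen = set(stack)
        while stack:  # depth-first search from the player's starting edge
            r, c = stack.pop()
            if (player == 1 and r == N - 1) or (player == -1 and c == N - 1):
                return player
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if (0 <= nr < N and 0 <= nc < N and (nr, nc) not in seen
                        and board[nr * N + nc] == player):
                    seen.add((nr, nc))
                    stack.append((nr, nc))
    return 0


def features(board, player):
    # Stones on the board, side to move, and a constant bias term.
    return board + [player, 1.0]


class LinearQ:
    """Linear stand-in for a neural network: one weight vector per action."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.01, 0.01) for _ in range(N * N + 2)]
                  for _ in range(N * N)]

    def q(self, board, player, action):
        x = features(board, player)
        return sum(wi * xi for wi, xi in zip(self.w[action], x))

    def update(self, board, player, action, target, lr=0.05):
        # One gradient step on the squared error between Q and the target.
        x = features(board, player)
        err = target - self.q(board, player, action)
        self.w[action] = [wi + lr * err * xi
                          for wi, xi in zip(self.w[action], x)]


def self_play_episode(model, rng, eps=0.2):
    """Play one game of the model against itself, updating after each move."""
    board, player = [0] * (N * N), 1
    while True:
        legal = [i for i, v in enumerate(board) if v == 0]
        if rng.random() < eps:  # epsilon-greedy exploration
            action = rng.choice(legal)
        else:
            action = max(legal, key=lambda a: model.q(board, player, a))
        nxt = board[:]
        nxt[action] = player
        if winner(nxt) == player:
            model.update(board, player, action, 1.0)  # mover has won
            return player
        # Negamax target: the mover's value is minus the opponent's best value.
        nxt_legal = [i for i, v in enumerate(nxt) if v == 0]
        best_reply = max(model.q(nxt, -player, a) for a in nxt_legal)
        target = max(-1.0, min(1.0, -best_reply))
        model.update(board, player, action, target)
        board, player = nxt, -player


if __name__ == "__main__":
    model, rng = LinearQ(), random.Random(1)
    results = [self_play_episode(model, rng) for _ in range(200)]
    print("wins for first player:", results.count(1))
```

The approximator stores one weight vector per action rather than one table entry per board position, which is the memory saving the abstract refers to. A neural network would take the place of `LinearQ` in a more realistic setting.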

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications