Neurohex: A Deep Q-learning Hex Agent

Kenny Young; Ryan Hayward; Gautham Vasan

arXiv:1604.07097·cs.AI·April 27, 2016

TL;DR

This paper introduces NeuroHex, a deep Q-learning agent built on an 11-layer CNN that plays Hex on the 13x13 board without search. After two weeks of Q-learning, it wins 20.4% of games as first player and 2.1% as second player against MoHex.

Contribution

NeuroHex is presented as the first deep Q-learning agent for Hex, a game with a large state and action space; it reaches competitive play without search.
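To illustrate what playing "without search" can mean, here is a minimal sketch: the agent picks the empty cell with the highest Q-value in a single pass, with no lookahead. The function name `greedy_move`, the board encoding (0 = empty, 1 and 2 = stones), and the nested-list Q-value layout are assumptions made for this example; they are not taken from the paper, and the Q-values would in practice come from the trained network.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: 0 = empty, 1 = first player, 2 = second player.
Board = List[List[int]]


def greedy_move(q_values: List[List[float]], board: Board) -> Optional[Tuple[int, int]]:
    """Return the empty cell with the highest Q-value, or None if the board is full.

    No game-tree search is performed: the move is chosen from one set of
    per-cell Q-values, skipping cells that are already occupied.
    """
    best_cell = None
    best_q = float("-inf")
    for r, row in enumerate(board):
        for c, stone in enumerate(row):
            if stone == 0 and q_values[r][c] > best_q:
                best_q = q_values[r][c]
                best_cell = (r, c)
    return best_cell
```

Masking occupied cells matters because a network may assign a high value to a cell that is no longer a legal move.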

Findings

01. NeuroHex wins 20.4% of games as first player and 2.1% as second player against MoHex.

02. NeuroHex was trained with two weeks of Q-learning and plays with no search.

03. The results suggest potential for further improvement with more training.

Abstract

DeepMind's recent spectacular success in using deep convolutional neural nets and machine learning to build superhuman-level agents (e.g. for Atari games via deep Q-learning and for the game of Go via reinforcement learning) raises many questions, including to what extent these methods will succeed in other domains. In this paper we consider deep Q-learning (DQL) for the game of Hex: after supervised initialization, we use self-play to train NeuroHex, an 11-layer CNN that plays Hex on the 13x13 board. Hex is the classic two-player alternate-turn stone placement game played on a rhombus of hexagonal cells, in which the winner is whoever connects their two opposing sides. Despite the large action and state space, our system trains a Q-network capable of strong play with no search. After two weeks of Q-learning, NeuroHex achieves win-rates of 20.4% as first player and 2.1% as second player against a…
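The winning condition described in the abstract (connecting one's two opposing sides on a rhombus of hexagonal cells) can be checked with a breadth-first search. The sketch below is only an illustration under stated assumptions: the board is a square list of lists in rhombus coordinates, player 1 is assumed to connect the top and bottom rows, player 2 the left and right columns, and the names `has_won` and `NEIGHBORS` are hypothetical. None of this is taken from the NeuroHex implementation.

```python
from collections import deque

# Six neighbours of a hexagonal cell in rhombus (row, column) coordinates.
NEIGHBORS = [(-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0)]


def has_won(board, player):
    """Return True if `player` has a chain of stones joining their two sides.

    Assumed convention: player 1 joins the top row to the bottom row,
    player 2 joins the left column to the right column. Empty cells are 0.
    """
    n = len(board)
    if player == 1:
        starts = [(0, c) for c in range(n) if board[0][c] == player]
    else:
        starts = [(r, 0) for r in range(n) if board[r][0] == player]

    seen = set(starts)
    queue = deque(starts)
    while queue:
        r, c = queue.popleft()
        # Reaching the far side completes the connection.
        if (player == 1 and r == n - 1) or (player != 1 and c == n - 1):
            return True
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < n and 0 <= nc < n
                    and (nr, nc) not in seen
                    and board[nr][nc] == player):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False
```

For example, on a 3x3 board, `has_won([[1, 0, 0], [1, 2, 2], [1, 0, 0]], 1)` finds the chain down the left column for player 1, while player 2 has no stone on the left edge and so has no connection.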

Taxonomy

Methods: Q-Learning