UCT-ADP Progressive Bias Algorithm for Solving Gomoku

Xu Cao; Yanghao Lin

arXiv:1912.05407·cs.AI·December 12, 2019

TL;DR

This paper introduces a novel Gomoku AI that combines ADP, UCT, heuristic-based pruning, and neural network evaluations to improve search efficiency and convergence speed over traditional UCT methods.

Contribution

It presents a new framework integrating ADP with UCT and heuristic pruning, enhancing Gomoku game tree search and convergence speed.

Findings

01 Faster convergence to correct values than standard UCT.

02 Effective elimination of search depth defects.

03 Potential applicability to other Gomoku-like games.

Abstract

We combine Adaptive Dynamic Programming (ADP), a reinforcement learning method, and the UCB applied to trees (UCT) algorithm with a more powerful heuristic function based on the Progressive Bias method and two pruning strategies for the traditional board game Gomoku. For the Adaptive Dynamic Programming part, we train a shallow forward neural network to give a quick evaluation of Gomoku board situations. UCT is a widely used tree policy in Monte Carlo tree search (MCTS). Our framework uses UCT to balance exploration and exploitation in Gomoku game trees, while we also apply powerful pruning strategies and a heuristic function to re-select the available 2-adjacent grids of the state, and use ADP instead of simulation to give estimated values of expanded nodes. Experimental results show that this method can eliminate the search depth defect of the simulation process and converge to the correct value faster than single…
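The two ideas the abstract names, restricting candidate moves to grids near existing stones and adding a Progressive Bias term to the UCT score, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the reading of "2-adjacent" as Chebyshev distance at most 2, the `+ 1` smoothing in the exploration term, and the exploration constant are all hypothetical choices. The bias term `heuristic / (visits + 1)` is the standard Progressive Bias form; in the paper's setting, `mean_value` would be fed by the ADP network's evaluations rather than by random rollouts.

```python
import math


def candidate_moves(stones, size=15, radius=2):
    """Return empty cells within `radius` (Chebyshev distance) of any stone.

    Assumption: "2-adjacent" is read as Chebyshev distance <= 2; the paper
    may define the neighbourhood differently.
    """
    occupied = set(stones)
    if not occupied:
        # Empty board: fall back to the centre point (an illustrative choice).
        return {(size // 2, size // 2)}
    cells = set()
    for r, c in occupied:
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < size and 0 <= nc < size
                if inside and (nr, nc) not in occupied:
                    cells.add((nr, nc))
    return cells


def uct_pb_score(mean_value, visits, parent_visits, heuristic, c=math.sqrt(2)):
    """UCT selection score with a Progressive Bias term.

    mean_value: average backed-up value of the child (here assumed to come
                from a value estimator such as an ADP network).
    heuristic:  domain heuristic for the move; its weight decays as the
                child is visited more often.
    The `+ 1` smoothing keeps the score finite for unvisited children, so
    they are ranked by the heuristic (an assumption of this sketch).
    """
    explore = c * math.sqrt(math.log(parent_visits + 1) / (visits + 1))
    bias = heuristic / (visits + 1)
    return mean_value + explore + bias
```

A tree policy built on this sketch would call `candidate_moves` when expanding a node and pick the child that maximizes `uct_pb_score`. Because the bias is divided by `visits + 1`, the heuristic dominates early selection and fades as value estimates accumulate.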

Code & Models

Repositories

IrohXu/Gomoku-XYH19
none · Official

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Metaheuristic Optimization Algorithms Research

Methods: Pruning