UCT-ADP Progressive Bias Algorithm for Solving Gomoku
Xu Cao, Yanghao Lin

TL;DR
This paper introduces a novel Gomoku AI that combines ADP, UCT, heuristic-based pruning, and neural network evaluations to improve search efficiency and convergence speed over traditional UCT methods.
Contribution
It presents a new framework integrating ADP with UCT and heuristic pruning, enhancing Gomoku game tree search and convergence speed.
Findings
Faster convergence to correct values than standard UCT.
Effective elimination of search depth defects.
Potential applicability to other Gomoku-like games.
Abstract
We combine Adaptive Dynamic Programming (ADP), a reinforcement learning method, and the UCB applied to trees (UCT) algorithm with a more powerful heuristic function based on the Progressive Bias method and two pruning strategies for the traditional board game Gomoku. For the ADP part, we train a shallow forward neural network to give a quick evaluation of Gomoku board situations. UCT is a general approach used as the tree policy in MCTS. Our framework uses UCT to balance exploration and exploitation in Gomoku game trees, while we also apply powerful pruning strategies and a heuristic function to re-select the available 2-adjacent grids of the state, and we use ADP instead of simulation to give estimated values of expanded nodes. Experimental results show that this method can eliminate the search depth defect of the simulation process and converge to the correct value faster than single…
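The two search ingredients named in the abstract, restricting candidate moves to cells near existing stones and adding a progressive-bias term to the UCT score, can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes that "2-adjacent" means empty cells within Chebyshev distance 2 of any stone, that the bias term takes the common form H / (n + 1), and that the heuristic value H is supplied by the caller (in the paper's setting it would presumably come from the learned evaluation). The function names, the 15x15 board size, and the exploration constant are all hypothetical.

```python
import math


def candidate_moves(stones, size=15, radius=2):
    """Return empty cells within `radius` (Chebyshev distance) of any stone.

    `stones` is a set of (row, col) tuples for occupied cells.
    The distance rule is an assumed reading of "2-adjacent grids".
    """
    if not stones:
        # Empty board: fall back to the centre cell (an illustrative choice).
        return {(size // 2, size // 2)}
    candidates = set()
    for row, col in stones:
        for d_row in range(-radius, radius + 1):
            for d_col in range(-radius, radius + 1):
                cell = (row + d_row, col + d_col)
                on_board = 0 <= cell[0] < size and 0 <= cell[1] < size
                if on_board and cell not in stones:
                    candidates.add(cell)
    return candidates


def uct_progressive_bias(total_value, visits, parent_visits, heuristic,
                         c=math.sqrt(2)):
    """UCT score with an added progressive-bias term.

    score = mean value + c * sqrt(ln(N) / n) + H / (n + 1)

    The bias term dominates while a child has few visits and fades as
    `visits` grows. Unvisited children get +inf so each is tried once
    (a common convention, assumed here).
    """
    if visits == 0:
        return math.inf
    exploit = total_value / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    bias = heuristic / (visits + 1)
    return exploit + explore + bias


# Example: a single stone in the centre of the board leaves the
# surrounding 5x5 block minus the stone itself as candidates.
centre_candidates = candidate_moves({(7, 7)})
example_score = uct_progressive_bias(
    total_value=3.0, visits=4, parent_visits=16, heuristic=0.5, c=1.0
)
```

With a single centre stone the candidate set has 24 cells instead of the 224 empty cells on the full board, which is the kind of branching-factor reduction the pruning step is aiming for. In the framework described above, the value backed up for a newly expanded node would come from the ADP network's evaluation rather than from a random playout.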
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Metaheuristic Optimization Algorithms Research
Methods: Pruning
