Dual Monte Carlo Tree Search

Prashank Kadam; Ruiyang Xu; Karl Lieberherr

arXiv:2103.11517 · cs.AI · October 12, 2021 · 1 cites

Open Access

TL;DR

This paper introduces Dual MCTS, a neural Monte Carlo Tree Search algorithm that reduces computational requirements and converges faster than AlphaZero by using two search trees and a novel update technique.

Contribution

The paper proposes Dual MCTS, a new neural MCTS algorithm that improves on AlphaZero's efficiency and performance by using two search trees, a single deep neural network, and a new update technique for the trees.

Findings

01. Dual MCTS outperforms AlphaZero in various games.
02. The new update technique reduces the number of tree updates.
03. Dual MCTS converges faster with less computational power.

Abstract

AlphaZero, which combines deep neural networks with Monte Carlo Tree Search (MCTS), has successfully trained reinforcement learning agents in a tabula rasa way. The neural MCTS algorithm has been successful in finding near-optimal strategies for games through self-play. However, AlphaZero has a significant drawback: it takes a long time to converge and requires high computational power because of the complex neural networks needed for games such as Chess, Go, and Shogi. As a result, neural MCTS research is very difficult to pursue without cutting-edge hardware, which is a roadblock for many aspiring researchers. In this paper, we propose a new neural MCTS algorithm, called Dual MCTS, which helps overcome these drawbacks. Dual MCTS uses two different search trees, a single deep neural network, and a new update technique for the search trees using a…
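For background on the baseline the abstract compares against, the selection step of an AlphaZero-style neural MCTS can be sketched as below. This is the commonly described PUCT rule, not the Dual MCTS update technique, which the truncated abstract does not spell out. The names `Node`, `puct_score`, and `select_action`, and the constant `c_puct = 1.5`, are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One edge/state in the search tree (hypothetical structure for illustration)."""
    prior: float                # P(s, a), taken from the network's policy output
    visit_count: int = 0        # N(s, a)
    value_sum: float = 0.0      # W(s, a), sum of values backed up through this node
    children: dict = field(default_factory=dict)  # action -> Node

    def mean_value(self) -> float:
        # Q(s, a); unvisited nodes default to 0.0 in this sketch.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Q + U, where U favors children with a high prior and few visits."""
    exploration = (
        c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    )
    return child.mean_value() + exploration


def select_action(parent: Node, c_puct: float = 1.5):
    """Return the action whose child maximizes the PUCT score."""
    return max(
        parent.children,
        key=lambda a: puct_score(parent, parent.children[a], c_puct),
    )


# Usage: a root with one well-explored child and one unvisited child.
root = Node(prior=1.0, visit_count=10)
root.children["a"] = Node(prior=0.6, visit_count=5, value_sum=1.0)
root.children["b"] = Node(prior=0.4)
chosen = select_action(root)
```

In this toy position the unvisited child `b` is chosen, because its exploration bonus outweighs the modest mean value of `a`. Each such selection is followed by a network evaluation and a tree update, which is the cost the abstract says Dual MCTS aims to reduce.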

Taxonomy

Topics: Artificial Intelligence in Games · Sports Analytics and Performance · Reinforcement Learning in Robotics

Methods: AlphaZero