Adapting Improved Upper Confidence Bounds for Monte-Carlo Tree Search

Yun-Ching Liu; Yoshimasa Tsuruoka

arXiv:1505.02830 · cs.AI · May 13, 2015

TL;DR

This paper introduces Mi-UCT, a version of the improved UCB algorithm modified for Monte-Carlo Tree Search, which outperforms plain UCT on 9×9 Go and 9×9 NoGo when only a small number of playouts is available.

Contribution

The paper proposes modifications to the improved UCB algorithm that make it better suited to MCTS. The resulting Mi-UCT algorithm outperforms plain UCT on 9×9 Go and 9×9 NoGo when the number of playouts is limited.
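For orientation, the following is a minimal sketch of the unmodified improved UCB algorithm as it is commonly stated in the bandit literature: arms are sampled in rounds, the gap estimate is halved each round, and arms whose upper bound falls below the best lower bound are eliminated. It is not the paper's modified version, and it is not Mi-UCT. The function name `improved_ucb`, the `pull` callback, and the choice to return once a single arm remains (rather than playing it until the horizon) are assumptions made for illustration.

```python
import math


def improved_ucb(pull, n_arms, horizon):
    """Round-based arm elimination (sketch of unmodified improved UCB).

    pull(i) is assumed to return a reward in [0, 1] for arm i.
    Returns (surviving arm indices, per-arm pull counts).
    """
    active = list(range(n_arms))
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    delta = 1.0  # current gap estimate, halved every round
    used = 0     # total pulls so far
    max_round = math.floor(0.5 * math.log2(horizon / math.e))

    for _ in range(max_round + 1):
        if len(active) == 1:
            break  # the original algorithm would keep playing this arm
        log_term = math.log(horizon * delta * delta)
        n_m = math.ceil(2.0 * log_term / (delta * delta))

        # Sample every active arm until it has been pulled n_m times in total.
        for i in active:
            while counts[i] < n_m and used < horizon:
                sums[i] += pull(i)
                counts[i] += 1
                used += 1
        if any(counts[i] < n_m for i in active):
            break  # budget ran out before the round completed

        # Eliminate arms whose upper bound is below the best lower bound.
        radius = math.sqrt(log_term / (2.0 * n_m))
        means = {i: sums[i] / counts[i] for i in active}
        best_lower = max(means.values()) - radius
        active = [i for i in active if means[i] + radius >= best_lower]
        delta /= 2.0

    return active, counts
```

With two hypothetical arms that always pay 0.9 and 0.1 and a horizon of 1000, the weaker arm survives the first round and is eliminated in the second. Note that the schedule depends on the horizon being known in advance, which is one property that does not transfer directly to tree search.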

Findings

01 Mi-UCT outperforms UCT with limited playouts.

02 Mi-UCT performs comparably to UCT with more playouts.

03 Modified UCB is more suitable for game tree search.

Abstract

The UCT algorithm, which combines the UCB algorithm and Monte-Carlo Tree Search (MCTS), is currently the most widely used variant of MCTS. Recently, a number of investigations into applying other bandit algorithms to MCTS have produced interesting results. In this research, we investigate the possibility of combining the improved UCB algorithm, proposed by Auer et al. (2010), with MCTS. However, various characteristics and properties of the improved UCB algorithm may not be ideal for a direct application to MCTS. Therefore, some modifications were made to the improved UCB algorithm to make it more suitable for the task of game tree search. The Mi-UCT algorithm is the modified UCB algorithm applied to trees. The performance of Mi-UCT is demonstrated on the games of $9 \times 9$ Go and $9 \times 9$ NoGo, and it has been shown to outperform the plain UCT algorithm when only a…
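As background for the baseline, here is a minimal sketch of the UCB1-style child selection that plain UCT applies at each tree node: pick the child with the highest mean reward plus an exploration bonus. The function name `ucb1_select`, the list-based statistics, and the default constant `c = sqrt(2)` (which reproduces the standard UCB1 bonus) are assumptions for illustration. Practical UCT implementations often tune this constant, and the sketch is not taken from the paper.

```python
import math


def ucb1_select(counts, value_sums, c=math.sqrt(2)):
    """Return the index of the child to visit next under a UCB1-style rule.

    counts[i]     -- number of playouts through child i
    value_sums[i] -- sum of playout rewards (each in [0, 1]) for child i
    c             -- exploration constant (assumed default: sqrt(2))
    """
    # Visit every child once before comparing confidence bounds.
    for i, n in enumerate(counts):
        if n == 0:
            return i

    total = sum(counts)
    scores = [
        value_sums[i] / counts[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

A rarely visited child receives a large bonus, so `ucb1_select([10, 1], [5.0, 0.4])` picks the second child even though its mean is lower. Setting `c=0` reduces the rule to greedy selection on the mean.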

Taxonomy

Topics: Artificial Intelligence in Games · Advanced Bandit Algorithms Research · Sports Analytics and Performance