An Efficient Algorithm for Thresholding Monte Carlo Tree Search
Shoma Nameki (1), Atsuyoshi Nakamura (2), Junpei Komiyama (3, 4), Koji Tabata (5) ((1) Graduate School of Information Science and Technology, Hokkaido University, (2) Faculty of Information Science and Technology, Hokkaido University

TL;DR
This paper presents an efficient algorithm for the Thresholding Monte Carlo Tree Search problem. A ratio-based modification of the D-Tracking arm-pulling strategy improves empirical sample complexity and lowers the per-round computational cost.
Contribution
The paper introduces a $\delta$-correct sequential sampling algorithm with asymptotically optimal sample complexity, together with a modified D-Tracking strategy that improves empirical performance and reduces per-round computational cost.
Findings
Asymptotically optimal sample complexity achieved.
Significant empirical improvements over previous methods.
Reduced per-round computational cost from linear to logarithmic.
Abstract
We introduce the Thresholding Monte Carlo Tree Search problem, in which, given a tree $\mathcal{T}$ and a threshold $\theta$, a player must answer whether the root node value of $\mathcal{T}$ is at least $\theta$ or not. In the given tree, 'MAX' or 'MIN' is labeled on each internal node, and the value of a 'MAX'-labeled ('MIN'-labeled) internal node is the maximum (minimum) of its child values. The value of a leaf node is the mean reward of an unknown distribution, from which the player can sample rewards. For this problem, we develop a $\delta$-correct sequential sampling algorithm based on the Track-and-Stop strategy that has asymptotically optimal sample complexity. We show that a ratio-based modification of the D-Tracking arm-pulling strategy leads to a substantial improvement in empirical sample complexity, as well as reducing the per-round computational cost from linear to…
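To make the problem statement concrete, the following minimal Python sketch evaluates the root value of a MAX/MIN-labeled tree and compares it with a threshold. It illustrates only the problem definition, not the paper's sampling algorithm: it assumes the leaf means are known, whereas in the actual problem the player can only sample rewards from the leaves. The names `Leaf`, `Internal`, `node_value`, and `root_at_least` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    # Mean reward of the leaf's distribution. Assumed known here for
    # illustration only; in the actual problem it is unknown and must
    # be estimated by sampling.
    mean: float


@dataclass
class Internal:
    label: str  # either "MAX" or "MIN"
    children: List["Node"] = field(default_factory=list)


Node = Union[Leaf, Internal]


def node_value(node: Node) -> float:
    """Return the value of a node under the MAX/MIN rule."""
    if isinstance(node, Leaf):
        return node.mean
    child_values = [node_value(child) for child in node.children]
    if node.label == "MAX":
        return max(child_values)
    if node.label == "MIN":
        return min(child_values)
    raise ValueError(f"unknown label: {node.label!r}")


def root_at_least(root: Node, theta: float) -> bool:
    """Answer the thresholding question: is the root value >= theta?"""
    return node_value(root) >= theta


# Hypothetical depth-2 tree: a MAX root over two MIN nodes.
tree = Internal("MAX", [
    Internal("MIN", [Leaf(0.7), Leaf(0.4)]),
    Internal("MIN", [Leaf(0.6), Leaf(0.55)]),
])

print(node_value(tree))          # max(min(0.7, 0.4), min(0.6, 0.55)) = 0.55
print(root_at_least(tree, 0.5))  # True
print(root_at_least(tree, 0.6))  # False
```

In the sequential setting described in the abstract, the leaf means would be replaced by empirical estimates, and the algorithm would keep sampling until it can give the correct answer with probability at least $1 - \delta$.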
Taxonomy
Topics: Artificial Intelligence in Games · Optimization and Search Problems · Advanced Bandit Algorithms Research
