Learning to Stop: Dynamic Simulation Monte-Carlo Tree Search

Li-Cheng Lan; Meng-Yu Tsai; Ti-Rong Wu; I-Chen Wu; Cho-Jui Hsieh

arXiv:2012.07910 · cs.AI · December 16, 2020 · 1 citation

TL;DR

This paper introduces DS-MCTS, a method that predicts the uncertainty of the current search state during Monte Carlo tree search and stops simulating once the result is reliable enough. On NoGo, this makes an agent 2.5 times faster while keeping a similar winning rate.

Contribution

The paper presents an uncertainty prediction approach for dynamic stopping in MCTS that reduces the computation required per move while maintaining competitive playing strength.

Findings

01

DS-MCTS makes a NoGo agent 2.5 times faster.

02

DS-MCTS maintains a similar winning rate while using fewer simulations.

03

With the same average simulation count, DS-MCTS achieves a 61% winning rate against the original program.

Abstract

Monte Carlo tree search (MCTS) has achieved state-of-the-art results in many domains, such as Go and Atari games, when combined with deep neural networks (DNNs). When more simulations are executed, MCTS can achieve higher performance, but it also requires enormous amounts of CPU and GPU resources. However, not all states require a long search time to identify the best action that the agent can find. For example, in 19x19 Go and NoGo, we found that for more than half of the states, the best action predicted by the DNN remains unchanged even after searching for 2 minutes. This implies that a significant amount of resources can be saved if we are able to stop searching earlier when we are confident in the current search result. In this paper, we propose to achieve this goal by predicting the uncertainty of the current search status and using the result to decide whether we should stop…
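The stopping idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `dynamic_search`, `visit_share_confidence`, `checkpoint`, and `threshold` are hypothetical. The paper predicts uncertainty from the search status, whereas the confidence function here is a simple stand-in (the share of visits on the most-visited action) chosen only to keep the example self-contained. The search is reduced to a loop that counts visits per root action.

```python
from typing import Callable, List, Tuple


def visit_share_confidence(visits: List[int]) -> float:
    """Hypothetical stand-in for an uncertainty predictor:
    the fraction of all visits that went to the most-visited action."""
    total = sum(visits)
    return max(visits) / total if total else 0.0


def dynamic_search(
    simulate: Callable[[List[int]], int],
    num_actions: int,
    max_sims: int,
    checkpoint: int,
    threshold: float,
    confidence_fn: Callable[[List[int]], float] = visit_share_confidence,
) -> Tuple[int, int]:
    """Run up to max_sims simulations, checking every `checkpoint`
    simulations whether the confidence is high enough to stop early.

    `simulate` receives the current visit counts and returns the index
    of the root action visited by one simulation (a toy abstraction of
    a full MCTS simulation). Returns (best_action, simulations_used).
    """
    visits = [0] * num_actions
    sims = 0
    while sims < max_sims:
        visits[simulate(visits)] += 1
        sims += 1
        # Only consult the confidence function at checkpoints.
        if sims % checkpoint == 0 and confidence_fn(visits) >= threshold:
            break
    best = max(range(num_actions), key=visits.__getitem__)
    return best, sims


# Toy "easy" state: every simulation agrees on action 2, so the search
# stops at the first checkpoint.
easy = dynamic_search(lambda v: 2, num_actions=3, max_sims=800,
                      checkpoint=100, threshold=0.9)

# Toy "hard" state: simulations alternate between actions 0 and 1, so
# confidence never reaches the threshold and the full budget is used.
hard = dynamic_search(lambda v: sum(v) % 2, num_actions=3, max_sims=800,
                      checkpoint=100, threshold=0.9)

print(easy, hard)
```

In this toy setup the easy state uses 100 of the 800 available simulations while the hard state uses all of them, which is the kind of saving the abstract argues for. A real system would replace `confidence_fn` with a predictor of its own and `simulate` with actual tree search.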

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Sports Analytics and Performance

Methods: AlphaZero