Accelerating Monte-Carlo Tree Search with Optimized Posterior Policies
Keith Frankston, Benjamin Howard

TL;DR
This paper presents RMCTS, a recursive Monte-Carlo tree search algorithm that uses optimized posterior policies to search substantially faster than AlphaZero's MCTS-UCB while training networks of comparable quality.
Contribution
Introduces RMCTS, a recursive, batch-friendly MCTS variant built on optimized posterior policies. It is often more than 40 times faster than MCTS-UCB when searching a single root state.
Findings
RMCTS is often more than 40 times faster than MCTS-UCB when searching a single root state.
RMCTS is about 3 times faster than MCTS-UCB when searching a large batch of root states.
Networks trained with RMCTS match the quality of those trained with MCTS-UCB in one-third of the training time.
Abstract
We introduce a recursive AlphaZero-style Monte-Carlo tree search algorithm, "RMCTS". The advantage of RMCTS over AlphaZero's MCTS-UCB is speed. In RMCTS, the search tree is explored in a breadth-first manner, so that network inferences naturally occur in large batches. This significantly reduces the GPU latency cost. We find that RMCTS is often more than 40 times faster than MCTS-UCB when searching a single root state, and about 3 times faster when searching a large batch of root states. The recursion in RMCTS is based on computing optimized posterior policies at each game state in the search tree, starting from the leaves and working back up to the root. Here we use the posterior policy explored in "Monte-Carlo tree search as regularized policy optimization" (Grill et al.). Their posterior policy is the unique policy which maximizes the expected reward given estimated action…
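As a rough illustration of the kind of posterior policy the abstract refers to, the sketch below computes the regularized policy from Grill et al.: the distribution y that maximizes sum(q * y) - lam * KL(prior, y), which has the closed form y[a] = lam * prior[a] / (alpha - q[a]) with alpha chosen so that y sums to 1. This is an assumption-laden sketch, not the authors' implementation: the function name `posterior_policy`, the plain constant `lam` (in Grill et al. the regularization weight depends on visit counts), and the bisection tolerance are all hypothetical choices, and this summary does not say how RMCTS sets them.

```python
def posterior_policy(prior, q, lam, tol=1e-12, max_iter=200):
    """Sketch of the regularized posterior policy of Grill et al.

    prior: strictly positive prior probabilities summing to 1 (assumed).
    q:     estimated action values, one per action.
    lam:   positive regularization weight (hypothetical constant here).
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if len(prior) != len(q) or any(p <= 0 for p in prior):
        raise ValueError("sketch assumes matching lengths and a positive prior")

    def total(alpha):
        # Sum of the unnormalized closed-form policy at a given alpha.
        return sum(lam * p / (alpha - v) for p, v in zip(prior, q))

    # alpha is bracketed by these bounds: total(lo) >= 1 and total(hi) <= 1.
    lo = max(v + lam * p for p, v in zip(prior, q))
    hi = max(q) + lam

    # total(alpha) is decreasing for alpha > max(q), so bisection applies.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if total(mid) > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    alpha = 0.5 * (lo + hi)
    y = [lam * p / (alpha - v) for p, v in zip(prior, q)]
    s = sum(y)  # renormalize to absorb the remaining bisection error
    return [v / s for v in y]


# Example: a small lam weights the value estimates heavily,
# a large lam keeps the result close to the prior.
print(posterior_policy([0.5, 0.3, 0.2], [0.1, 0.4, -0.2], lam=0.5))
```

When all action values are equal the result reduces to the prior, and as `lam` shrinks the mass concentrates on the highest-valued action, which is the trade-off the regularized objective encodes.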
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Sports Analytics and Performance
