Bilevel MCTS for Amortized O(1) Node Selection in Classical Planning
Masataro Asai

TL;DR
This paper introduces a bilevel modification to MCTS that achieves amortized O(1) node selection time in classical planning, where tree-based selection otherwise costs time that grows with the search depth. It also introduces Tree Collapsing, which reduces the number of action selection steps.
Contribution
It presents a novel bilevel modification to MCTS and a Tree Collapsing technique to reduce node selection complexity in classical planning.
Findings
Achieves amortized O(1) node selection runtime.
Reduces action selection steps with Tree Collapsing.
Improves planning efficiency in large-depth problems.
Abstract
We study an efficient implementation of Multi-Armed Bandit (MAB)-based Monte-Carlo Tree Search (MCTS) for classical planning. One weakness of MCTS is that it spends a significant time deciding which node to expand next. While selecting a node from an OPEN list with N nodes has O(1) runtime complexity with traditional array-based priority-queues for dense integer keys, the tree-based OPEN list used by MCTS requires O(log N), which roughly corresponds to the search depth d. In classical planning, d is arbitrarily large (e.g., 2^k - 1 in k-disk Tower-of-Hanoi) and the runtime for node selection is significant, unlike in game tree search, where the cost is negligible compared to the node evaluation (rollouts) because d is inherently limited by the game (e.g., d ≤ 361 in Go). To improve this bottleneck, we propose a bilevel modification to MCTS that runs a best-first…
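The contrast the abstract draws between an array-based priority queue and a tree-based OPEN list can be illustrated with a minimal sketch. This is not the paper's implementation: `BucketOpenList` and `select_leaf` are hypothetical names, the node representation (a dict with a `children` list) is an assumption, and the queue assumes keys are non-negative integers with a small, known upper bound.

```python
class BucketOpenList:
    """Array-based OPEN list for dense non-negative integer keys (assumed bounded)."""

    def __init__(self, max_key):
        self.buckets = [[] for _ in range(max_key + 1)]
        self.min_key = max_key + 1  # lower bound on the smallest non-empty key
        self.size = 0

    def push(self, key, node):
        # Constant time: append to the bucket and update the lower bound.
        self.buckets[key].append(node)
        if key < self.min_key:
            self.min_key = key
        self.size += 1

    def pop(self):
        if self.size == 0:
            raise IndexError("pop from empty OPEN list")
        # Skip empty buckets; the scan is bounded by the key range, not by N.
        while not self.buckets[self.min_key]:
            self.min_key += 1
        self.size -= 1
        return self.min_key, self.buckets[self.min_key].pop()


def select_leaf(root, score):
    """Tree-based selection: walk from the root to a leaf, one choice per level."""
    node, steps = root, 0
    while node["children"]:
        node = max(node["children"], key=score)
        steps += 1
    return node, steps  # steps equals the depth of the selected leaf
```

Each call to `select_leaf` pays one step per tree level, so its cost tracks the depth of the leaf it reaches, whereas `BucketOpenList.pop` does not depend on how many nodes are stored.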
Taxonomy
Topics: Artificial Intelligence in Games · AI-based Problem Solving and Planning · Advanced Bandit Algorithms Research
