Parallel Monte Carlo Tree Search with Batched Rigid-body Simulations for Speeding up Long-Horizon Episodic Robot Planning
Baichuan Huang, Abdeslam Boularias, Jingjin Yu

TL;DR
This paper introduces PMBS, a GPU-accelerated parallel Monte Carlo tree search algorithm with batched simulations, significantly speeding up long-horizon robotic planning tasks while maintaining solution quality.
Contribution
The paper presents a novel GPU-based parallel MCTS algorithm with batched simulations, enabling faster and more accurate long-horizon robotic planning.
Findings
Achieves over 30x speedup compared to serial MCTS.
Improves solution quality in object retrieval tasks.
Demonstrates effective real-robot application with negligible sim-to-real gap.
Abstract
We propose a novel Parallel Monte Carlo tree search with Batched Simulations (PMBS) algorithm for accelerating long-horizon, episodic robotic planning tasks. Monte Carlo tree search (MCTS) is an effective heuristic search algorithm for solving episodic decision-making problems whose underlying search spaces are expansive. Leveraging a GPU-based large-scale simulator, PMBS introduces massive parallelism into MCTS for solving planning tasks through the batched execution of a large number of concurrent simulations, which allows for more efficient and accurate evaluations of the expected cost-to-go over large action spaces. When applied to the challenging manipulation tasks of object retrieval from clutter, PMBS achieves a speedup of over 30x with an improved solution quality, in comparison to a serial MCTS implementation. We show that PMBS can be directly applied to real robot…
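The idea of pairing MCTS with batched simulations, as described in the abstract, can be illustrated with a small sketch. This is not the authors' implementation: a toy integer world stands in for the GPU rigid-body simulator, and every name here (`batched_step`, `batched_rollout`, `ACTIONS`, `GOAL`, and so on) is hypothetical. The sketch assumes one plausible scheme in which a selected leaf is expanded over all actions at once and all of its children are evaluated by many random rollouts executed as a single batch.

```python
import math
import numpy as np

ACTIONS = np.array([0, 1, 2])  # hypothetical discrete actions (step sizes)
GOAL = 6                       # hypothetical goal state of the toy world

def batched_step(states, actions):
    """Toy stand-in for a batched simulator: advances every state in one call."""
    return states + actions

def reward(states):
    """Reward in (0, 1], highest when a state sits exactly on the goal."""
    return 1.0 / (1.0 + np.abs(GOAL - states))

def batched_rollout(states, horizon, rng):
    """Random rollouts for all states at once; returns one reward per state."""
    for _ in range(horizon):
        states = batched_step(states, rng.choice(ACTIONS, size=states.shape))
    return reward(states)

class Node:
    def __init__(self, state, depth, parent=None):
        self.state, self.depth, self.parent = state, depth, parent
        self.children = {}
        self.visits, self.value_sum = 0, 0.0

def select_leaf(root, c=1.4):
    """Descend with UCB1 until reaching a node that has no children."""
    node = root
    while node.children:
        parent = node
        node = max(parent.children.values(),
                   key=lambda ch: ch.value_sum / ch.visits
                   + c * math.sqrt(math.log(parent.visits) / ch.visits))
    return node

def backprop(node, n, value):
    """Add n visits and their summed value to the node and all its ancestors."""
    while node is not None:
        node.visits += n
        node.value_sum += value
        node = node.parent

def search(root_state, iterations=50, rollouts_per_child=64, max_depth=3, seed=0):
    rng = np.random.default_rng(seed)
    root = Node(root_state, 0)
    for _ in range(iterations):
        leaf = select_leaf(root)
        if leaf.depth >= max_depth:
            # Horizon reached: score the leaf state directly as one visit.
            backprop(leaf, 1, float(reward(np.array([leaf.state]))[0]))
            continue
        # Expand every action in a single batched simulator call.
        next_states = batched_step(np.full(len(ACTIONS), leaf.state), ACTIONS)
        # Replicate each child state and roll all copies out as one batch.
        batch = np.repeat(next_states, rollouts_per_child)
        rewards = batched_rollout(batch, max_depth - leaf.depth - 1, rng)
        rewards = rewards.reshape(len(ACTIONS), rollouts_per_child)
        for a, s, r in zip(ACTIONS, next_states, rewards):
            child = Node(int(s), leaf.depth + 1, leaf)
            child.visits = rollouts_per_child
            child.value_sum = float(r.sum())
            leaf.children[int(a)] = child
        backprop(leaf, rewards.size, float(rewards.sum()))
    return root

def best_action(root):
    """Pick the root action whose child has the highest mean value."""
    return max(root.children,
               key=lambda a: root.children[a].value_sum / root.children[a].visits)
```

Calling `best_action(search(0))` runs the search from state 0. In a real setting, `batched_step` would be replaced by a physics simulator that advances many environments concurrently on the GPU, which is where the benefit of batching would come from; the NumPy version only mirrors the control flow.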
Taxonomy
Topics: Robotic Path Planning Algorithms · Artificial Intelligence in Games · Reinforcement Learning in Robotics
