Decentralized Monte Carlo Tree Search for Partially Observable Multi-agent Pathfinding
Alexey Skrynnik, Anton Andreychuk, Konstantin Yakovlev, Aleksandr Panov

TL;DR
This paper introduces a decentralized Monte Carlo Tree Search method for multi-agent pathfinding in partially observable environments, outperforming existing learnable solutions by leveraging local observations and neural MCTS.
Contribution
It presents a novel decentralized MCTS approach inspired by AlphaZero for lifelong MAPF with partial observability and limited communication.
Findings
Outperforms state-of-the-art learnable MAPF solvers
Effective in lifelong MAPF scenarios with local observations
Utilizes neural MCTS tailored for multi-agent planning
Abstract
The Multi-Agent Pathfinding (MAPF) problem involves finding a set of conflict-free paths for a group of agents confined to a graph. In typical MAPF scenarios, the graph and the agents' starting and ending vertices are known beforehand, allowing the use of centralized planning algorithms. However, in this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally and are restricted in communications with each other. Specifically, we investigate the lifelong variant of MAPF, where new goals are continually assigned to the agents upon completion of previous ones. Drawing inspiration from the successful AlphaZero approach, we propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks. Our approach utilizes the agent's observations to recreate the intrinsic Markov decision process, which is then used for…
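The abstract cites AlphaZero as the inspiration for the search procedure. As a rough illustration only, the sketch below shows the generic AlphaZero-style PUCT selection and value backup used in neural MCTS. It is not the authors' decentralized algorithm: the five-move grid action set, the `Node` class, the helper names, and the exploration constant `c_puct=1.5` are all assumptions made for this example, and the priors would normally come from a policy network rather than being hard-coded.

```python
import math
from dataclasses import dataclass, field

# Assumed action set for a grid-world agent (hypothetical, for illustration).
ACTIONS = ["wait", "up", "down", "left", "right"]


@dataclass
class Node:
    """Search-tree node holding AlphaZero-style statistics."""
    prior: float                      # policy prior P(s, a) for reaching this node
    visit_count: int = 0              # N(s, a)
    value_sum: float = 0.0            # sum of backed-up values
    children: dict = field(default_factory=dict)

    def mean_value(self) -> float:
        # Q(s, a); unvisited nodes default to 0.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Q + U, where U favours high-prior, rarely visited children."""
    exploration = (
        c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    )
    return child.mean_value() + exploration


def select_action(parent: Node, c_puct: float = 1.5) -> str:
    """Pick the child action with the highest PUCT score."""
    return max(
        parent.children,
        key=lambda a: puct_score(parent, parent.children[a], c_puct),
    )


def expand(node: Node, priors: dict) -> None:
    """Create one child per action, using the supplied prior probabilities."""
    for action, p in priors.items():
        node.children[action] = Node(prior=p)


def backup(path: list, value: float) -> None:
    """Propagate a leaf value estimate to every node on the search path."""
    for node in path:
        node.visit_count += 1
        node.value_sum += value


# Usage: priors are hard-coded here in place of a policy network's output.
root = Node(prior=1.0)
expand(root, {"wait": 0.1, "up": 0.6, "down": 0.1, "left": 0.1, "right": 0.1})
root.visit_count = 1
first_choice = select_action(root)          # highest prior wins while Q is 0
backup([root, root.children[first_choice]], value=-1.0)
second_choice = select_action(root)         # a poor value shifts the choice
```

In this toy run the first selection follows the prior, and after a negative value is backed up through that child the selection moves to a different action. A decentralized variant, as the abstract describes, would run such a search per agent over a model rebuilt from that agent's local observations.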
Taxonomy
Topics: Robotic Path Planning Algorithms · Multimodal Machine Learning Applications · Multi-Agent Systems and Negotiation
Methods: Sparse Evolutionary Training · Focus · AlphaZero
