PRIMAL: Pathfinding via Reinforcement and Imitation Multi-Agent Learning
Guillaume Sartoretti, Justin Kerr, Yunfei Shi, Glenn Wagner, T. K. Satish Kumar, Sven Koenig, and Howie Choset

TL;DR
PRIMAL introduces a decentralized reinforcement and imitation learning framework for multi-agent pathfinding, enabling scalable, online, reactive planning suitable for large-scale real-world robot deployments.
Contribution
It develops a novel decentralized learning approach combining reinforcement and imitation learning, allowing scalable, online MAPF without centralized planning.
Findings
Scaled to as many as 1024 agents in randomized worlds.
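Achieved higher success rates than state-of-the-art MAPF planners in certain cases, notably in low obstacle-density environments, even though those planners have access to the full state of the system.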
Validated in a hybrid simulation with real and simulated robots.
Abstract
Multi-agent path finding (MAPF) is an essential component of many large-scale, real-world robot deployments, from aerial swarms to warehouse automation. However, despite the community's continued efforts, most state-of-the-art MAPF planners still rely on centralized planning and scale poorly past a few hundred agents. Such planning approaches are maladapted to real-world deployments, where noise and uncertainty often require paths be recomputed online, which is impossible when planning times are in seconds to minutes. We present PRIMAL, a novel framework for MAPF that combines reinforcement and imitation learning to teach fully-decentralized policies, where agents reactively plan paths online in a partially-observable world while exhibiting implicit coordination. This framework extends our previous work on distributed learning of collaborative policies by introducing demonstrations of…
