PRIMAL2: Pathfinding via Reinforcement and Imitation Multi-Agent Learning -- Lifelong
Mehul Damani, Zhiyao Luo, Emerson Wenzel, Guillaume Sartoretti

TL;DR
PRIMAL2 is a decentralized reinforcement learning framework that enables scalable, real-time multi-agent pathfinding in complex, structured environments, significantly improving coordination and reactivity in lifelong MAPF scenarios.
Contribution
PRIMAL2 extends previous work on decentralized MAPF policies to highly structured environments by developing new agent behaviors and training methods for lifelong MAPF.
Findings
PRIMAL2 outperforms previous methods in dense environments.
It scales to 2048 agents in real time.
It achieves performance comparable to state-of-the-art planners.
Abstract
Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. In particular, this work addresses lifelong MAPF (LMAPF) - an online variant of the problem where agents are immediately assigned a new goal upon reaching their current one - in dense and highly structured environments, typical of real-world warehouse operations. Effectively solving LMAPF in such environments requires expensive coordination between agents as well as frequent replanning abilities, a daunting task for existing coupled and decoupled approaches alike. With the purpose of achieving considerable agent coordination without any compromise on reactivity and scalability, we introduce PRIMAL2, a distributed reinforcement learning framework for LMAPF where agents learn fully decentralized policies to reactively…
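To make the lifelong setting concrete, here is a minimal sketch of the LMAPF loop described above, in which an agent receives a new goal as soon as it reaches its current one. It is not PRIMAL2's learned policy. The obstacle-free grid, the sequential move order, and the names `reassign_goal`, `greedy_step`, and `run_lmapf` are all assumptions made for illustration, and the greedy move rule is only a stand-in for a decentralized policy.

```python
import random


def reassign_goal(position, free_cells, rng):
    """Pick a new goal uniformly among free cells other than the current cell."""
    candidates = [c for c in free_cells if c != position]
    return rng.choice(candidates)


def greedy_step(position, goal, free_set, occupied):
    """Placeholder policy (assumption): step toward the goal if possible, else wait."""
    x, y = position
    gx, gy = goal
    moves = []
    if gx != x:
        moves.append((x + (1 if gx > x else -1), y))
    if gy != y:
        moves.append((x, y + (1 if gy > y else -1)))
    for nxt in moves:
        if nxt in free_set and nxt not in occupied:
            return nxt
    return position  # blocked or already at the goal: stay in place


def run_lmapf(width, height, n_agents, steps, seed=0):
    """Simulate lifelong MAPF on an empty grid; return (goals reached, final positions)."""
    rng = random.Random(seed)
    free_cells = [(x, y) for x in range(width) for y in range(height)]
    free_set = set(free_cells)
    positions = rng.sample(free_cells, n_agents)  # distinct start cells
    goals = [reassign_goal(p, free_cells, rng) for p in positions]
    goals_reached = 0
    for _ in range(steps):
        # Agents move one at a time, so two agents never share a cell.
        for i in range(n_agents):
            occupied = set(positions) - {positions[i]}
            positions[i] = greedy_step(positions[i], goals[i], free_set, occupied)
            if positions[i] == goals[i]:
                goals_reached += 1
                # Lifelong variant: assign a new goal immediately.
                goals[i] = reassign_goal(positions[i], free_cells, rng)
    return goals_reached, positions


if __name__ == "__main__":
    reached, _ = run_lmapf(width=8, height=8, n_agents=6, steps=200)
    print("goals reached per timestep:", reached / 200)
```

Dividing the number of goals reached by the number of timesteps gives a throughput figure, a common way to compare lifelong planners. The greedy rule can deadlock in dense or structured maps, which is the kind of coordination failure a learned policy is meant to avoid.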
