RDE: A Hybrid Policy Framework for Multi-Agent Path Finding Problem
Jianqi Gao, Yanjie Li, Xiaoqing Yang, Mingshan Tan

TL;DR
This paper introduces RDE, a hybrid policy framework for multi-agent path finding that combines reinforcement learning, distance heat maps, and escape strategies to improve navigation efficiency in structured environments.
Contribution
The paper proposes a novel hybrid MAPF policy, RDE, that switches among RL, DHM, and escape policies to address deadlocks and enhance multi-agent navigation.
Findings
RDE significantly improves the performance of RL-based MAPF policies.
Simulations on warehouse-like maps show improved deadlock avoidance.
The hybrid approach outperforms pure RL policies in structured environments.
Abstract
Multi-agent path finding (MAPF) is an abstract model for the navigation of multiple robots in warehouse automation, where multiple robots plan collision-free paths from the start to goal positions. Reinforcement learning (RL) has been employed to develop partially observable distributed MAPF policies that can be scaled to any number of agents. However, RL-based MAPF policies often get agents stuck in deadlock due to warehouse automation's dense and structured obstacles. This paper proposes a novel hybrid MAPF policy, RDE, based on switching among the RL-based MAPF policy, the Distance heat map (DHM)-based policy and the Escape policy. The RL-based policy is used for coordination among agents. In contrast, when no other agents are in the agent's field of view, it can get the next action by querying the DHM. The escape policy that randomly selects valid actions can help agents escape the…
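The switching idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract only states that the DHM is queried when no other agent is in the field of view, that the RL policy handles coordination, and that the escape policy picks random valid actions. Everything else here is an assumption made for illustration: the 4-connected grid with a "stay" action, building the DHM by breadth-first search from the goal, the `stuck` flag and its priority over the other two policies, and all function names (`build_dhm`, `dhm_action`, `select_action`). The RL policy is passed in as an opaque callable.

```python
import random
from collections import deque

# Assumed grid convention: 0 = free cell, 1 = obstacle.
MOVES = {"stay": (0, 0), "up": (-1, 0), "down": (1, 0),
         "left": (0, -1), "right": (0, 1)}


def build_dhm(grid, goal):
    """Assumed DHM construction: BFS distance from each reachable cell to the goal."""
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOVES.values():
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def valid_actions(grid, pos):
    """Actions that keep the agent on a free cell inside the map."""
    rows, cols = len(grid), len(grid[0])
    actions = []
    for name, (dr, dc) in MOVES.items():
        nr, nc = pos[0] + dr, pos[1] + dc
        if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
            actions.append(name)
    return actions


def dhm_action(grid, dhm, pos):
    """Query the DHM: take the valid action whose successor is closest to the goal."""
    def successor_distance(name):
        dr, dc = MOVES[name]
        return dhm.get((pos[0] + dr, pos[1] + dc), float("inf"))
    return min(valid_actions(grid, pos), key=successor_distance)


def select_action(grid, dhm, pos, others_in_fov, stuck, rl_policy, rng=random):
    """Switch among the escape, DHM, and RL policies.

    `stuck` is a hypothetical flag; how deadlock is detected is not
    specified in the text above, so the caller must supply it.
    """
    if stuck:
        # Escape policy: random valid action.
        return rng.choice(valid_actions(grid, pos))
    if not others_in_fov:
        # No other agent in the field of view: follow the DHM.
        return dhm_action(grid, dhm, pos)
    # Other agents visible: defer to the RL policy for coordination.
    return rl_policy(pos, others_in_fov)
```

For example, on a 3x3 grid whose middle row is blocked except for the rightmost cell, an agent at the top-left corner with its goal at the bottom-left corner has to travel around the obstacle, and `dhm_action` would send it right first because that neighbor has the smaller distance value.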
Taxonomy
Topics: Robotic Path Planning Algorithms · Reinforcement Learning in Robotics · Optimization and Search Problems
