Scalable Hierarchical Reinforcement Learning for Hyper Scale Multi-Robot Task Planning

Xuan Zhou; Xiang Shi; Lele Zhang; Chen Chen; Hongbo Li; Lin Ma; Fang Deng; Jie Chen

arXiv:2412.19538·cs.RO·May 6, 2026

TL;DR

This paper introduces a scalable hierarchical reinforcement learning (HRL) framework for multi-robot task planning in large-scale warehouse systems, reporting generalization to unseen maps at scales of up to 200 robots and 1000 racks.

Contribution

The paper develops a multi-stage planner based on hierarchical reinforcement learning (HRL), combining a hierarchical temporal attention network with curricula to improve scalability and generalization in hyper scale multi-robot task planning.

Findings

01 Outperforms state-of-the-art methods in simulated and real-world RMFS.

02 Successfully scales to 200 robots and 1000 racks on unseen maps.

03 Maintains high performance across various unlearned environments.

Abstract

To improve the efficiency of warehousing systems and meet large volumes of customer orders, we address the challenges of the curse of dimensionality and dynamic properties in hyper scale multi-robot task planning (MRTP) for robotic mobile fulfillment systems (RMFS). Existing research indicates that hierarchical reinforcement learning (HRL) is an effective method for mitigating these challenges. Building on this, we construct an efficient multi-stage HRL-based multi-robot task planner for hyper scale MRTP in RMFS, in which the planning process is represented with a special temporal graph topology. To ensure optimality, the planner is designed with a centralized architecture, but this also brings challenges in scaling up and generalization, which require policies to maintain performance across various unlearned scales and maps. To tackle these difficulties, we first construct a hierarchical temporal attention network…
