A hierarchical spatial-aware algorithm with efficient reinforcement learning for human-robot task planning and allocation in production

Jintao Xue, Xiao Li, and Nianmin Zhang

arXiv:2604.12669 · cs.AI · April 17, 2026

TL;DR

This paper introduces a hierarchical, spatially aware reinforcement learning algorithm for human-robot task planning and allocation (TPA) in complex manufacturing environments. Its high-level agent uses an efficient buffer-based deep Q-learning method (EBQ) that is reported to reduce training time and improve performance.

Contribution

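The paper proposes a hierarchical task planning and allocation (TPA) framework that pairs an efficient buffer-based deep Q-learning method (EBQ) for high-level task planning with a spatially aware path planning approach for low-level task allocation in dynamic production settings.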

Findings

01. The EBQ method reduces training time and improves performance in long-term reward scenarios.

02. The SAP method allocates tasks effectively in real time while accounting for spatial information.

03. Experiments in complex 3D production simulations demonstrate the method's effectiveness.
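This summary does not describe how SAP works internally, so the following is only a rough illustration of what allocating a task with spatial information can look like, not the authors' method. It assumes straight-line distance and a fixed travel speed per agent; the `Agent` class, the `allocate` function, and all positions and speeds are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Agent:
    name: str          # hypothetical identifier, e.g. "human_1" or "robot_1"
    position: tuple    # (x, y) position on the shop floor, in metres
    speed: float       # travel speed in metres per second (assumed constant)
    busy: bool = False


def travel_time(agent, task_position):
    """Straight-line travel time; a real planner would use the planned path length."""
    return math.dist(agent.position, task_position) / agent.speed


def allocate(agents, task_position):
    """Return the idle agent with the shortest travel time, or None if all are busy."""
    idle = [a for a in agents if not a.busy]
    if not idle:
        return None
    return min(idle, key=lambda a: travel_time(a, task_position))


agents = [
    Agent("human_1", (0.0, 0.0), speed=1.2),
    Agent("robot_1", (4.0, 3.0), speed=0.8),
    Agent("robot_2", (1.0, 1.0), speed=0.8, busy=True),
]
chosen = allocate(agents, task_position=(3.0, 4.0))
print(chosen.name)  # prints robot_1
```

Here `robot_1` wins despite its lower speed because it starts much closer to the task. Replacing `math.dist` with a path length from a motion planner would be the natural next step in a cluttered 3D workspace.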

Abstract

In advanced manufacturing systems, humans and robots collaborate to carry out the production process. Effective task planning and allocation (TPA) is crucial for achieving high production efficiency, yet it remains challenging in complex and dynamic manufacturing environments. The dynamic nature of humans and robots, particularly the need to consider spatial information (e.g., a human's real-time position and the distance they must move to complete a task), substantially complicates TPA. To address these challenges, we decompose production tasks into manageable subtasks. We then implement a real-time hierarchical human-robot TPA algorithm, comprising a high-level agent for task planning and a low-level agent for task allocation. For the high-level agent, we propose an efficient buffer-based deep Q-learning method (EBQ), which reduces training time and enhances performance in…
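The abstract is cut off before it explains how EBQ uses its buffer, so the following is only a generic sketch of Q-learning with an experience replay buffer, the family of techniques EBQ belongs to. It swaps the deep Q-network for a plain lookup table so that it runs with the standard library alone. The subtask names, states, rewards, and hyperparameters are all hypothetical.

```python
import random
from collections import defaultdict, deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)  # oldest transitions are dropped first

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size, rng):
        k = min(batch_size, len(self.storage))
        return rng.sample(list(self.storage), k)


def q_update(q, transition, actions, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step (a table stands in for the deep Q-network)."""
    state, action, reward, next_state, done = transition
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


rng = random.Random(0)
actions = ["fetch", "assemble"]   # hypothetical subtasks chosen by the high-level agent
q = defaultdict(float)
buffer = ReplayBuffer(capacity=100)
buffer.add(("s0", "fetch", 0.0, "s1", False))    # no immediate reward
buffer.add(("s1", "assemble", 1.0, "s2", True))  # reward arrives only at the end

# Replaying stored transitions lets the delayed reward propagate back to "s0".
for _ in range(50):
    for transition in buffer.sample(2, rng):
        q_update(q, transition, actions)
```

After the replay loop, the value of `("s0", "fetch")` becomes positive even though that step earned no immediate reward, which is the kind of long-horizon credit assignment the findings refer to.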
