Learning Reactive Dexterous Grasping via Hierarchical Task-Space RL Planning and Joint-Space QP Control

Ho Jae Lee, Yonghyeon Lee, Alexander Alexiev, Tzu-Yuan Lin, Se Hwan Jeon, and Sangbae Kim

arXiv:2605.03363·cs.RO·May 6, 2026

TL;DR

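This paper introduces a hierarchical control framework for reactive dexterous grasping: reinforcement learning agents plan task-space velocities, and a quadratic programming (QP) controller converts them into joint velocities that respect kinematic limits and collision avoidance. The learned grasping skills transfer zero-shot from simulation to real hardware.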

Contribution

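The paper presents a multi-agent RL architecture, split into separate arm and hand agents, coupled with a GPU-parallelized QP controller. Decoupling high-level planning from low-level execution speeds up training convergence, enforces hardware safety, and lets operators adjust safety margins without retraining.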

Findings

01 Achieved robust zero-shot transfer of grasping skills from simulation to real hardware.

02 Enabled dynamic obstacle avoidance and safety margin adjustments without retraining.

03 Demonstrated reactive recovery from physical disturbances in unstructured environments.

Abstract

In this work, we propose a hybrid hierarchical control framework for reactive dexterous grasping that explicitly decouples high-level spatial intent from low-level joint execution. We introduce a multi-agent reinforcement learning architecture, specialized into distinct arm and hand agents, that acts as a high-level planner by generating desired task-space velocity commands. These commands are then processed by a GPU-parallelized quadratic programming controller, which translates them into feasible joint velocities while strictly enforcing kinematic limits and collision avoidance. This structural isolation not only accelerates training convergence but also strictly enforces hardware safety. Furthermore, the architecture unlocks zero-shot steerability, allowing system operators to dynamically adjust safety margins and avoid dynamic obstacles without retraining the policy. We extensively…
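
The low-level step described in the abstract, turning a desired task-space velocity into a feasible joint velocity under hard limits, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the paper's formulation: a planar two-link arm stands in for the robot, the QP is a damped least-squares problem with box constraints solved by projected gradient descent on the CPU (the paper uses a GPU-parallelized solver), collision constraints are omitted, and the adjustable `margin` on joint limits stands in for the operator-tunable safety margin. All function names are hypothetical.

```python
import math

def jacobian(q, l1=1.0, l2=1.0):
    """2x2 position Jacobian of a planar two-link arm (hypothetical example robot)."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def velocity_bounds(q, q_min, q_max, qd_max, margin, gain=2.0):
    """Box bounds on joint velocity: rate limits, tightened near position limits.

    Inside the safety margin the bound toward the limit becomes zero, so the
    joint can only move away from it. The margin is a plain runtime parameter.
    """
    lo, hi = [], []
    for qi, a, b, m in zip(q, q_min, q_max, qd_max):
        hi.append(max(0.0, min(m, gain * (b - margin - qi))))
        lo.append(min(0.0, max(-m, gain * (a + margin - qi))))
    return lo, hi

def solve_box_qp(J, v_des, lo, hi, damping=1e-3, iters=500):
    """Minimize 0.5*||J qd - v_des||^2 + 0.5*damping*||qd||^2 s.t. lo <= qd <= hi.

    Solved by projected gradient descent (an assumption; any QP solver works).
    """
    rows, n = len(J), len(lo)
    # Quadratic term H = J^T J + damping * I and linear term g = J^T v_des.
    H = [[sum(J[k][i] * J[k][j] for k in range(rows)) + (damping if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    g = [sum(J[k][i] * v_des[k] for k in range(rows)) for i in range(n)]
    # trace(H) upper-bounds the largest eigenvalue of H, so this step is stable.
    step = 1.0 / sum(H[i][i] for i in range(n))
    qd = [0.0] * n
    for _ in range(iters):
        grad = [sum(H[i][j] * qd[j] for j in range(n)) - g[i] for i in range(n)]
        # Gradient step followed by projection onto the box [lo, hi].
        qd = [min(hi[i], max(lo[i], qd[i] - step * grad[i])) for i in range(n)]
    return qd

def joint_velocity_command(q, v_des, margin,
                           q_min=(-2.0, -1.5), q_max=(2.0, 1.5), qd_max=(1.0, 1.0)):
    """Map a planner's task-space velocity to a limit-respecting joint velocity."""
    lo, hi = velocity_bounds(q, q_min, q_max, qd_max, margin)
    return solve_box_qp(jacobian(q), v_des, lo, hi)

if __name__ == "__main__":
    q = [0.3, 1.2]
    # Same commanded velocity, two different safety margins, no retraining involved.
    print(joint_velocity_command(q, [-0.1, 0.0], margin=0.05))
    print(joint_velocity_command(q, [-0.1, 0.0], margin=0.5))
```

In this sketch the planner's command `v_des` is never modified; only the constraint set changes with `margin`. That is one way to read the abstract's claim that safety margins can be adjusted at run time while the learned policy stays fixed.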
