Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids

Toru Lin; Kartik Sachdev; Linxi Fan; Jitendra Malik; Yuke Zhu

arXiv:2502.20396 · cs.RO · September 3, 2025

TL;DR

This paper presents a practical sim-to-real reinforcement learning recipe that trains a humanoid robot to perform three vision-based dexterous manipulation tasks (grasp-and-reach, box lift, and bimanual handover), reporting high success rates and robust behavior on real hardware.

Contribution

The paper introduces a sim-to-real RL framework for humanoid manipulation that combines an automated real-to-sim tuning module, a generalized reward formulation based on contact and object goals, policy distillation, and a hybrid object representation.

Findings

01 High success rates on unseen objects

02 Robust and adaptive manipulation behaviors

03 Scalable to real-world humanoid tasks

Abstract

Learning generalizable robot manipulation policies, especially for complex multi-fingered humanoids, remains a significant challenge. Existing approaches primarily rely on extensive data collection and imitation learning, which are expensive, labor-intensive, and difficult to scale. Sim-to-real reinforcement learning (RL) offers a promising alternative, but has mostly succeeded in simpler state-based or single-hand setups. How to effectively extend this to vision-based, contact-rich bimanual manipulation tasks remains an open question. In this paper, we introduce a practical sim-to-real RL recipe that trains a humanoid robot to perform three challenging dexterous manipulation tasks: grasp-and-reach, box lift and bimanual handover. Our method features an automated real-to-sim tuning module, a generalized reward formulation based on contact and object goals, a divide-and-conquer policy…
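
To make the idea of a reward "based on contact and object goals" concrete, here is a minimal sketch of what such a reward could look like. The abstract does not state the reward's functional form, so everything here is an assumption for illustration: the exponential distance shaping, the `scale` and weight parameters, and the function names `contact_goal_reward`, `object_goal_reward`, and `total_reward` are hypothetical and are not taken from the paper.

```python
import math

def contact_goal_reward(fingertips, contact_points, scale=10.0):
    """Hypothetical contact-goal term: mean closeness of each fingertip
    to its assigned contact point on the object (1.0 when touching)."""
    total = 0.0
    for tip, target in zip(fingertips, contact_points):
        total += math.exp(-scale * math.dist(tip, target))
    return total / len(fingertips)

def object_goal_reward(object_pos, goal_pos, scale=10.0):
    """Hypothetical object-goal term: closeness of the object
    to its goal position (1.0 when the object is at the goal)."""
    return math.exp(-scale * math.dist(object_pos, goal_pos))

def total_reward(fingertips, contact_points, object_pos, goal_pos,
                 w_contact=0.5, w_object=0.5):
    """Weighted sum of the two terms; the weights are assumed values."""
    return (w_contact * contact_goal_reward(fingertips, contact_points)
            + w_object * object_goal_reward(object_pos, goal_pos))

# Usage: one fingertip 5 cm from its contact point, object 20 cm from goal.
r = total_reward(
    fingertips=[(0.0, 0.0, 0.05)],
    contact_points=[(0.0, 0.0, 0.0)],
    object_pos=(0.0, 0.0, 0.0),
    goal_pos=(0.0, 0.0, 0.2),
)
```

Under these assumptions, the same two terms can be reused across tasks by changing only the contact points and the object goal, which is one plausible reading of "generalized" in the abstract.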
