HandCept: A Visual-Inertial Fusion Framework for Accurate Proprioception in Dexterous Hands

Junda Huang; Jianshu Zhou; Honghao Guo; Yunhui Liu

arXiv:2505.08213·cs.RO·May 14, 2025

TL;DR

HandCept is a real-time visual-inertial fusion framework that combines a wrist-mounted RGB-D camera with 9-axis IMUs to estimate joint angles in dexterous robotic hands, reaching errors between 2° and 4° without observable drift.

Contribution

The paper presents a visual-inertial fusion framework for proprioception in dexterous hands that pairs a zero-shot learning approach with a latency-free Extended Kalman Filter (EKF), and includes a high-fidelity rendering pipeline for sim-to-real transfer.

Findings

01. Achieves joint angle estimation errors between 2° and 4° without observable drift.

02. Outperforms visual-only and inertial-only methods.

03. Provides a stable, calibrated IMU system with a common base frame.

Abstract

As robotics progresses toward general manipulation, dexterous hands are becoming increasingly critical. However, proprioception in dexterous hands remains a bottleneck due to limitations in volume and generality. In this work, we present HandCept, a novel visual-inertial proprioception framework designed to overcome the challenges of traditional joint angle estimation methods. HandCept addresses the difficulty of achieving accurate and robust joint angle estimation in dynamic environments where both visual and inertial measurements are prone to noise and drift. It leverages a zero-shot learning approach using a wrist-mounted RGB-D camera and 9-axis IMUs, fused in real time via a latency-free Extended Kalman Filter (EKF). Our results show that HandCept achieves joint angle estimation errors between $2^{\circ}$ and $4^{\circ}$ without observable drift, outperforming visual-only and…
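
The abstract names the fusion mechanism (visual and inertial measurements combined in an EKF) but this page gives no implementation details. As a rough illustration of the general idea only, and not the authors' filter, the sketch below fuses a single joint angle with a scalar Kalman filter, which is what an EKF reduces to when the model is linear. The function name `fuse_joint_angle`, the noise variances `q` and `r`, the gyro bias, and the synthetic data are all assumptions made for the example; the latency handling described in the paper is not modeled.

```python
import random


def fuse_joint_angle(gyro_rates, visual_angles, dt,
                     q=1e-4, r=4.0, theta0=0.0, p0=10.0):
    """Scalar Kalman filter for one joint angle (illustrative only).

    gyro_rates    : joint angular rate per step in deg/s (may carry a bias)
    visual_angles : visual joint-angle estimate per step in deg, or None
    q, r          : assumed process / measurement noise variances
    """
    theta, p = theta0, p0
    estimates = []
    for rate, z in zip(gyro_rates, visual_angles):
        # Predict: integrate the gyro rate; uncertainty grows by q.
        theta += rate * dt
        p += q
        # Update: pull the estimate toward the visual measurement, if any.
        if z is not None:
            k = p / (p + r)          # Kalman gain
            theta += k * (z - theta)
            p *= (1.0 - k)
        estimates.append(theta)
    return estimates


# Synthetic demo (hypothetical numbers): a joint held at 30 deg,
# a gyro with a 0.5 deg/s bias, and a noisy visual estimate.
rng = random.Random(0)
dt, steps, true_angle = 0.01, 2000, 30.0
gyro = [0.5 + rng.gauss(0.0, 0.2) for _ in range(steps)]
visual = [true_angle + rng.gauss(0.0, 2.0) for _ in range(steps)]

fused = fuse_joint_angle(gyro, visual, dt)
gyro_only_error = abs(sum(rate * dt for rate in gyro))  # pure integration drift
fused_error = abs(fused[-1] - true_angle)
print(f"gyro-only drift: {gyro_only_error:.2f} deg, fused error: {fused_error:.2f} deg")
```

In this toy setup the gyro-only integral drifts steadily with the bias, while the fused estimate stays near the true angle because each visual measurement pulls it back. That is the qualitative behavior the abstract attributes to fusion over inertial-only estimation.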
