HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction

Jikai Wang; Qifan Zhang; Yu-Wei Chao; Bowen Wen; Xiaohu Guo; Yu Xiang

arXiv:2406.06843 · cs.CV · March 12, 2025

TL;DR

HO-Cap introduces a cost-effective data capture system and dataset for 3D reconstruction and pose tracking of hands and objects in videos, facilitating research in embodied AI and robotics.

Contribution

The paper presents a novel multi-camera data capture system and semi-automatic annotation method for 3D hand-object interaction datasets, reducing costs and annotation time.

Findings

01. Captured diverse hand-object interaction videos

02. Developed semi-automatic annotation method

03. Dataset available for community use

Abstract

We introduce a data capture system and a new dataset, HO-Cap, for 3D reconstruction and pose tracking of hands and objects in videos. The system leverages multiple RGBD cameras and a HoloLens headset for data collection, avoiding the use of expensive 3D scanners or mocap systems. We propose a semi-automatic method for annotating the shape and pose of hands and objects in the collected videos, significantly reducing the annotation time compared to manual labeling. With this system, we captured a video dataset of humans interacting with objects to perform various tasks, including simple pick-and-place actions, handovers between hands, and using objects according to their affordance, which can serve as human demonstrations for research in embodied AI and robot manipulation. Our data capture setup and annotation framework will be available for the community to use in reconstructing 3D…

Code & Models

Repositories

IRVLUTD/HO-Cap (PyTorch, official)

Videos

HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction (SlidesLive)

Taxonomy

Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Human Motion and Animation