Glove2Hand: Synthesizing Natural Hand-Object Interaction from Multi-Modal Sensing Gloves

Xinyu Zhang; Ziyi Kou; Chuan Qin; Mia Huang; Ergys Ristani; Ankit Kumar; Lele Chen; Kun He; Abdeslam Boularias; Li Guan

arXiv:2603.20850·cs.CV·March 24, 2026

Open Access

TL;DR

Glove2Hand is a framework that converts multi-modal sensing glove videos into photorealistic hand images while preserving physical interaction details. The authors also use it to create a new dataset aimed at improving hand-object interaction understanding in vision and robotics.

Contribution

We introduce Glove2Hand, a novel method for translating multi-modal glove data into realistic hand images, and present HandSense, a new multi-modal HOI dataset with tactile and IMU signals.

Findings

01. Enhanced hand tracking under occlusion

02. Improved contact estimation accuracy

03. Realistic hand-object interaction synthesis

Abstract

Understanding hand-object interaction (HOI) is fundamental to computer vision, robotics, and AR/VR. However, conventional hand videos often lack essential physical information such as contact forces and motion signals, and are prone to frequent occlusions. To address these challenges, we present Glove2Hand, a framework that translates multi-modal sensing glove HOI videos into photorealistic bare hands while faithfully preserving the underlying physical interaction dynamics. We introduce a novel 3D Gaussian hand model that ensures temporal rendering consistency. The rendered hand is seamlessly integrated into the scene using a diffusion-based hand restorer, which effectively handles complex hand-object interactions and non-rigid deformations. Leveraging Glove2Hand, we create HandSense, the first multi-modal HOI dataset featuring glove-to-hand videos with synchronized tactile and IMU…

Taxonomy

Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Robot Manipulation and Learning