Visuo-Acoustic Hand Pose and Contact Estimation

Yuemin Mao; Uksang Yoo; Yunchao Yao; Shahram Najam Syed; Luca Bondi; Jonathan Francis; Jean Oh; Jeffrey Ichnowski

arXiv:2508.00852 · cs.HC · August 25, 2025

Open Access

TL;DR

VibeMesh is a wearable visuo-acoustic system that combines vision and active acoustic sensing with a graph neural network to accurately estimate hand pose and contact points, especially under occlusion.

Contribution

The paper introduces a non-intrusive, wearable visuo-acoustic platform, a cross-modal graph network for dense hand pose and contact estimation, and a new dataset.

Findings

01. Outperforms vision-only methods in accuracy

02. Robust in occluded and static-contact scenarios

03. Provides dense, high-resolution contact predictions

Abstract

Accurately estimating hand pose and hand-object contact events is essential for robot data collection, immersive virtual environments, and biomechanical analysis, yet remains challenging due to visual occlusion, subtle contact cues, limitations in vision-only sensing, and the lack of accessible and flexible tactile sensing. We therefore introduce VibeMesh, a novel wearable system that fuses vision with active acoustic sensing for dense, per-vertex hand contact and pose estimation. VibeMesh integrates a bone-conduction speaker and sparse piezoelectric microphones, distributed on a human hand, emitting structured acoustic signals and capturing their propagation to infer changes induced by contact. To interpret these cross-modal signals, we propose a graph-based attention network that processes synchronized audio spectra and RGB-D-derived hand meshes to predict contact with high spatial…

Taxonomy

Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Robot Manipulation and Learning