UmeTrack: Unified multi-view end-to-end hand tracking for VR

Shangchen Han; Po-chen Wu; Yubo Zhang; Beibei Liu; Linguang Zhang; Zheng Wang; Weiguang Si; Peizhao Zhang; Yujun Cai; Tomas Hodan; Randi Cabezas; Luan Tran; Muzaffer Akbay; Tsz-Ho Yu; Cem Keskin; Robert Wang

arXiv:2211.00099·cs.CV·November 2, 2022

TL;DR

UmeTrack introduces a unified, end-to-end differentiable framework for real-time multi-view 3D hand tracking in VR, directly predicting world-space hand poses and enhancing VR interaction accuracy.

Contribution

The paper presents an end-to-end multi-view hand tracking model that predicts 3D hand poses directly in world space, avoiding the root-relative output and multi-stage pipelines of previous methods. It also introduces a new large-scale egocentric hand pose dataset.

Findings

01. System effectively handles challenging interactive motions.

02. Successfully applied to real-time VR applications.

03. Outperforms existing methods in accuracy and robustness.

Abstract

Real-time tracking of 3D hand pose in world space is a challenging problem and plays an important role in VR interaction. Existing work in this space is limited to either producing root-relative (versus world-space) 3D pose or relying on multiple stages, such as generating heatmaps and kinematic optimization, to obtain 3D pose. Moreover, the typical VR scenario, which involves multi-view tracking from wide field-of-view (FOV) cameras, is seldom addressed by these methods. In this paper, we present a unified end-to-end differentiable framework for multi-view, multi-frame hand tracking that directly predicts 3D hand pose in world space. We demonstrate the benefits of end-to-end differentiability by extending our framework with downstream tasks such as jitter reduction and pinch prediction. To demonstrate the efficacy of our model, we further present a new large-scale egocentric hand pose dataset that…
