Multi-View Video-Based 3D Hand Pose Estimation

Leyla Khaleghi; Alireza Sepas Moghaddam; Joshua Marshall; Ali Etemad

arXiv:2109.11747 · cs.CV · September 27, 2021

TL;DR

This paper introduces MuViHand, a large multi-view video dataset with ground-truth 3D hand poses, and MuViHandNet, a neural network pipeline that leverages multi-view and temporal data for improved 3D hand pose estimation.

Contribution

The paper presents a new synthetic multi-view video dataset featuring complex backgrounds and dynamic lighting, along with a neural network architecture that uses both multi-view and temporal information for 3D hand pose estimation.

Findings

01

The MuViHand dataset contains more than 402,000 synthetic hand images in 4,560 videos, captured from six angles.

02

MuViHandNet outperforms baseline methods on the new dataset.

03

Temporal and multi-view information significantly improve estimation accuracy.

Abstract

Hand pose estimation (HPE) can be used for a variety of human-computer interaction applications such as gesture-based control for physical or virtual/augmented reality devices. Recent works have shown that videos or multi-view images carry rich information regarding the hand, allowing for the development of more robust HPE systems. In this paper, we present the Multi-View Video-Based 3D Hand (MuViHand) dataset, consisting of multi-view videos of the hand along with ground-truth 3D pose labels. Our dataset includes more than 402,000 synthetic hand images available in 4,560 videos. The videos have been simultaneously captured from six different angles with complex backgrounds and random levels of dynamic lighting. The data has been captured from 10 distinct animated subjects using 12 cameras in a semi-circle topology where six tracking cameras only focus on the hand and the other six…

Code & Models

Repositories

leylakhaleghi/muvihand
Official

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Stroke Rehabilitation and Recovery

Methods: Convolution · Max Pooling · Concatenated Skip Connection · U-Net