REACH: Hand Pose Estimation from Room Corners
Shu Nakamura, Ryo Kawahara, Genki Kinoshita, Ryosuke Hirai, Yasutomo Kawanishi, Shohei Nobuhara, Ko Nishino

TL;DR
This paper presents REACH-Net, a Transformer-based model for accurate 3D hand pose estimation from low-resolution, frequently occluded views captured by fixed cameras at room corners. It leverages hand-body coordination and is trained and tested on a new large-scale dataset, REACH.
Contribution
The paper introduces a novel Transformer-based model, REACH-Net, and the REACH dataset for accurate 3D hand pose estimation from afar in challenging room environments.
Findings
REACH-Net outperforms existing methods in accuracy.
The REACH dataset includes diverse daily activities of 50 participants.
The approach enables 'in-the-wild' human behavior analysis.
Abstract
We introduce a novel 3D hand pose estimator that can accurately recover the shape and pose of people's hands in a room from afar, typically from fixed cameras at room corners, in extremely low-resolution and frequently occluded views. Our key idea is to fully leverage hand-body coordination, its temporal progression, and multiview observations. We achieve this with a novel Transformer-based model, in which hand and body configurations are modeled through correlations between their visual features expressed as per-view tokens, and their temporal coordination is exploited in an autoregressive manner. We introduce a novel dataset, which we refer to as REACH, Room-Environment dataset Annotated with Chest cameras for Hand pose estimation, to train and test our method. REACH is a first-of-its-kind large-scale hand pose dataset that captures accurate hand movements of 50 participants across a…
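The two mechanisms named in the abstract, correlating hand and body features expressed as per-view tokens and exploiting their temporal coordination autoregressively, can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`attend`, `fuse_frame`, `run_sequence`), the averaging over views, and the absence of learned projections are all assumptions made for illustration, and plain scaled dot-product attention stands in for whatever the real model uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def fuse_frame(hand_tokens, body_tokens, prev_state=None):
    """Hypothetical fusion step for one frame.

    Each per-view hand token attends to all per-view body tokens. If a
    previous state is given, it is appended as one extra context token
    (the autoregressive part). The attended vectors are averaged over views.
    """
    context = list(body_tokens)
    if prev_state is not None:
        context.append(prev_state)
    attended = [attend(h, context, context) for h in hand_tokens]
    dim = len(attended[0])
    return [sum(a[d] for a in attended) / len(attended) for d in range(dim)]


def run_sequence(frames):
    """Process (hand_tokens, body_tokens) pairs in order, feeding each
    frame's output back in as context for the next frame."""
    state = None
    outputs = []
    for hand_tokens, body_tokens in frames:
        state = fuse_frame(hand_tokens, body_tokens, state)
        outputs.append(state)
    return outputs


if __name__ == "__main__":
    # Two views, two-dimensional toy features, two frames.
    frame = ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
    for t, out in enumerate(run_sequence([frame, frame])):
        print(t, out)
```

In a real model the tokens would come from an image backbone and pass through learned query, key, and value projections. The sketch only shows the data flow: hand tokens query body tokens across views, and each frame's output becomes context for the next.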
