AnyHand: A Large-Scale Synthetic Dataset for RGB(-D) Hand Pose Estimation
Chen Si, Yulin Liu, Bo Ai, Jianwen Xie, Rolandos Alexandros Potamias, Chuanxia Zheng, Hao Su

TL;DR
AnyHand is a large-scale synthetic dataset of 2.5M single-hand and 4.1M hand-object interaction RGB-D images, designed to improve 3D hand pose estimation from RGB-only and RGB-D inputs by providing diverse, occlusion-rich data with rich geometric annotations.
Contribution
The paper introduces AnyHand, a large-scale synthetic dataset for 3D hand pose estimation, and shows that adding it to the training data of existing baselines improves their accuracy and generalization across multiple benchmarks.
Findings
Extending the original training sets of existing baselines with AnyHand yields significant gains on the FreiHAND and HO-3D benchmarks.
Models trained with AnyHand generalize better to out-of-domain datasets.
A lightweight depth fusion module enhances RGB-D model performance.
Abstract
We present AnyHand, a large-scale synthetic dataset designed to advance the state of the art in 3D hand pose estimation from both RGB-only and RGB-D inputs. While recent works with foundation approaches have shown that an increase in the quantity and diversity of training data can markedly improve performance and robustness in hand pose estimation, existing real-world-collected datasets on this task are limited in coverage, and prior synthetic datasets rarely provide occlusions, arm details, and aligned depth together at scale. To address this bottleneck, our AnyHand contains 2.5M single-hand and 4.1M hand-object interaction RGB-D images, with rich geometric annotations. In the RGB-only setting, we show that extending the original training sets of existing baselines with AnyHand yields significant gains on multiple benchmarks (FreiHAND and HO-3D), even when keeping the architecture and…
