Introducing HOT3D: An Egocentric Dataset for 3D Hand and Object Tracking
Prithviraj Banerjee, Sindi Shkodrani, Pierre Moulon, Shreyas Hampali, Fan Zhang, Jade Fountain, Edward Miller, Selen Basol, Richard Newcombe, Robert Wang, Jakob Julian Engel, Tomas Hodan

TL;DR
HOT3D is an egocentric dataset of multi-view RGB/monochrome image streams and multimodal signals capturing hand-object interactions, designed to advance research in 3D hand and object tracking.
Contribution
The paper introduces HOT3D, a large-scale, multimodal dataset for egocentric 3D hand and object tracking, with detailed annotations, diverse scenarios, and ground-truth poses obtained with a professional motion-capture system.
Findings
Provides over 3.7 million images with detailed 3D annotations
Includes diverse scenarios like kitchen, office, and living room interactions
Facilitates public challenges to accelerate research in egocentric hand-object tracking
Abstract
We introduce HOT3D, a publicly available dataset for egocentric hand and object tracking in 3D. The dataset offers over 833 minutes (more than 3.7M images) of multi-view RGB/monochrome image streams showing 19 subjects interacting with 33 diverse rigid objects, multi-modal signals such as eye gaze or scene point clouds, as well as comprehensive ground truth annotations including 3D poses of objects, hands, and cameras, and 3D models of hands and objects. In addition to simple pick-up/observe/put-down actions, HOT3D contains scenarios resembling typical actions in a kitchen, office, and living room environment. The dataset is recorded by two head-mounted devices from Meta: Project Aria, a research prototype of light-weight AR/AI glasses, and Quest 3, a production VR headset sold in millions of units. Ground-truth poses were obtained by a professional motion-capture system using small…
Taxonomy
Topics: Hand Gesture Recognition Systems
