Egocentric 6-DoF Tracking of Small Handheld Objects
Rohit Pandey, Pavel Pidlypenskyi, Shuoran Yang, Christine Kaeser-Chen

TL;DR
This paper presents a real-time method for 6-DoF tracking of handheld controllers from egocentric stereo images. The method runs on a mobile CPU and is supported by a new dataset and a specialized deep learning model.
Contribution
The paper introduces the SSD-AF-Stereo3D model and the HMD Controller dataset for efficient 6-DoF tracking from egocentric views, and pairs the model's 3D keypoint predictions with data from an IMU on the controller.
Findings
SSD-AF-Stereo3D achieves a mean average error of 33.5 mm in 3D keypoint prediction.
All models run in real time on a mobile CPU.
Combining the stereo-image predictions with the controller's IMU enables 6-DoF tracking.
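The 3D keypoint error quoted above can be read as an average distance between predicted and labelled keypoint positions. The following is a minimal sketch of such a metric, assuming it is the Euclidean distance per sample averaged over all samples; the paper's exact definition is not given on this page, and the function name and sample values are hypothetical.

```python
import math


def mean_keypoint_error_mm(predicted, ground_truth):
    """Average Euclidean distance between matched 3D points.

    Both arguments are sequences of (x, y, z) tuples in millimeters,
    so the result is also in millimeters.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical values for illustration only (not from the dataset).
predicted = [(0.0, 0.0, 500.0), (10.0, 0.0, 400.0)]
ground_truth = [(3.0, 4.0, 500.0), (10.0, 0.0, 400.0)]
error = mean_keypoint_error_mm(predicted, ground_truth)
```

With these two samples the per-sample distances are 5 mm and 0 mm, so the averaged error is 2.5 mm.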
Abstract
Virtual and augmented reality technologies have seen significant growth in the past few years. A key component of such systems is the ability to track the pose of head-mounted displays and controllers in 3D space. We tackle the problem of efficient 6-DoF tracking of a handheld controller from egocentric camera perspectives. We collected the HMD Controller dataset, which consists of over 540,000 stereo image pairs labelled with the full 6-DoF pose of the handheld controller. Our proposed SSD-AF-Stereo3D model achieves a mean average error of 33.5 millimeters in 3D keypoint prediction and is used in conjunction with an IMU sensor on the controller to enable 6-DoF tracking. We also present results on approaches for model-based full 6-DoF tracking. All our models operate under the strict constraints of real-time mobile CPU inference.
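One way to picture how a 3D keypoint and an IMU reading together give a 6-DoF pose: the keypoint supplies the three translational degrees of freedom and the IMU orientation supplies the three rotational ones. The sketch below is an illustration under stated assumptions, not the authors' fusion method. It assumes the IMU reports a quaternion in (w, x, y, z) order that is already expressed in the same coordinate frame as the keypoint, and all function names and values are hypothetical.

```python
import math


def quaternion_to_rotation(w, x, y, z):
    """Convert a quaternion to a 3x3 rotation matrix (row-major lists)."""
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    # Normalize so that slightly noisy input still yields a valid rotation.
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def compose_pose(position_mm, orientation_wxyz):
    """Build a 4x4 homogeneous pose from a position and an orientation."""
    rotation = quaternion_to_rotation(*orientation_wxyz)
    tx, ty, tz = position_mm
    return [
        rotation[0] + [tx],
        rotation[1] + [ty],
        rotation[2] + [tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Hypothetical inputs: a keypoint 400 mm in front of the camera and an
# orientation rotated 90 degrees about the z axis.
half_angle = math.radians(90.0) / 2.0
pose = compose_pose(
    (30.0, -20.0, 400.0),
    (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle)),
)
```

In a real system the IMU and camera frames would need to be calibrated against each other before the two measurements could be combined this way; that step is omitted here.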
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Human Pose and Action Recognition
