In-Hand Object Pose Estimation via Visual-Tactile Fusion
Felix Nonnengießer, Alap Kshirsagar, Boris Belousov, Jan Peters

TL;DR
This paper introduces a method that fuses visual and tactile data to improve in-hand object pose estimation for robotic manipulation, especially under occlusion, achieving higher accuracy than vision-only methods.
Contribution
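The paper presents a sensor fusion approach that combines point clouds from a wrist-mounted RGB-D camera and fingertip-mounted vision-based tactile sensors, then estimates the 6D object pose with an ICP algorithm adapted for weighted point clouds.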
Findings
Tactile data significantly improves pose accuracy under occlusion.
The method achieved an average pose error of 7.5 mm in position and 16.7 degrees in orientation.
Outperformed vision-only baselines by up to 20%.
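This summary does not say how the position and orientation errors are computed. A common convention, assumed here and not confirmed by the source, is the Euclidean distance between estimated and ground-truth translations plus the geodesic angle between the two rotations. The function name `pose_error` and the choice of metres as the input unit are illustrative.

```python
import numpy as np

def pose_error(R_est, t_est, R_gt, t_gt):
    """Return (translation error in mm, rotation error in degrees).

    Assumes rotations are 3x3 matrices and translations are in metres.
    This is a common convention, not necessarily the paper's exact metric.
    """
    trans_err_mm = 1000.0 * np.linalg.norm(np.asarray(t_est) - np.asarray(t_gt))
    # Relative rotation between estimate and ground truth.
    R_delta = np.asarray(R_est) @ np.asarray(R_gt).T
    # Angle of the relative rotation; clip guards against rounding outside [-1, 1].
    cos_angle = np.clip((np.trace(R_delta) - 1.0) / 2.0, -1.0, 1.0)
    rot_err_deg = np.degrees(np.arccos(cos_angle))
    return trans_err_mm, rot_err_deg
```

Averaging these two values over all test grasps would give a pair of numbers comparable in form to the ones reported above.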
Abstract
Accurate in-hand pose estimation is crucial for robotic object manipulation, but visual occlusion remains a major challenge for vision-based approaches. This paper presents an approach to robotic in-hand object pose estimation, combining visual and tactile information to accurately determine the position and orientation of objects grasped by a robotic hand. We address the challenge of visual occlusion by fusing visual information from a wrist-mounted RGB-D camera with tactile information from vision-based tactile sensors mounted on the fingertips of a robotic gripper. Our approach employs a weighting and sensor fusion module to combine point clouds from heterogeneous sensor types and control each modality's contribution to the pose estimation process. We use an augmented Iterative Closest Point (ICP) algorithm adapted for weighted point clouds to estimate the 6D object pose. Our…
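The idea of weighting fused point clouds and feeding them to ICP can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses point-to-point ICP with a weighted SVD (Kabsch) alignment step and brute-force nearest neighbours, and the function names (`fuse_clouds`, `weighted_rigid_fit`, `weighted_icp`) and the default weights are hypothetical. The abstract does not specify how weights are chosen or which ICP variant is adapted.

```python
import numpy as np

def fuse_clouds(visual_pts, tactile_pts, w_visual=1.0, w_tactile=5.0):
    """Stack two (N, 3) clouds and attach one scalar weight per point.

    The per-modality weights are illustrative values, not from the paper.
    """
    pts = np.vstack([visual_pts, tactile_pts])
    w = np.concatenate([np.full(len(visual_pts), w_visual),
                        np.full(len(tactile_pts), w_tactile)])
    return pts, w

def weighted_rigid_fit(src, dst, w):
    """Weighted least-squares rigid transform (R, t) with R @ src_i + t ~ dst_i."""
    w = w / w.sum()
    mu_s = (w[:, None] * src).sum(axis=0)   # weighted centroids
    mu_d = (w[:, None] * dst).sum(axis=0)
    H = (src - mu_s).T @ (w[:, None] * (dst - mu_d))  # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t

def weighted_icp(src, w, model, iters=30):
    """Align the weighted sensor cloud `src` to the object model cloud `model`."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        # Brute-force nearest model point for each sensor point (fine for small clouds).
        d2 = ((cur[:, None, :] - model[None, :, :]) ** 2).sum(axis=2)
        nearest = model[d2.argmin(axis=1)]
        R, t = weighted_rigid_fit(cur, nearest, w)
        cur = cur @ R.T + t
        # Compose the incremental update with the accumulated transform.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

In this sketch, raising `w_tactile` relative to `w_visual` makes the few tactile contact points pull the alignment more strongly, which is one plausible way to let touch compensate when the camera view of the object is occluded by the gripper.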
Taxonomy
Topics: Robot Manipulation and Learning · Advanced Sensor and Energy Harvesting Materials · Soft Robotics and Applications
