3D Foundation Models Enable Simultaneous Geometry and Pose Estimation of Grasped Objects
Weiming Zhi, Haozhan Tang, Tianyi Zhang, Matthew Johnson-Roberson

TL;DR
This paper introduces a method that uses 3D foundation models to jointly estimate the geometry and pose of grasped objects from RGB images. The estimates are expressed in the robot's coordinate frame, and the external camera's extrinsic parameters do not need to be calibrated.
Contribution
The approach leverages large pre-trained 3D models and a coordinate-alignment technique to accurately estimate object geometry and pose in the robot's frame, without requiring camera extrinsic calibration.
Findings
Effective geometry and pose estimation on real-world objects
No extrinsic calibration of the external camera is required
Enables robot manipulation based on object coordinates
Abstract
Humans have the remarkable ability to use held objects as tools to interact with their environment. For this to occur, humans internally estimate how hand movements affect the object's movement. We wish to endow robots with this capability. We contribute methodology to jointly estimate the geometry and pose of objects grasped by a robot, from RGB images captured by an external camera. Notably, our method transforms the estimated geometry into the robot's coordinate frame, while not requiring the extrinsic parameters of the external camera to be calibrated. Our approach leverages 3D foundation models, large models pre-trained on huge datasets for 3D vision tasks, to produce initial estimates of the in-hand object. These initial estimations do not have physically correct scales and are in the camera's frame. Then, we formulate, and efficiently solve, a coordinate-alignment problem to…
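The abstract is truncated before it states the coordinate-alignment problem, so the following is only a generic sketch of this kind of alignment, not the authors' formulation. It assumes a set of 3D points is available both in the scale-free camera frame (for example, points on the gripper taken from the foundation model's output) and in the robot frame (for example, from forward kinematics), and it recovers a scale, rotation, and translation with the classic Umeyama closed-form solution. The function name `umeyama_alignment` and the synthetic data are hypothetical.

```python
import numpy as np


def umeyama_alignment(src, dst):
    """Least-squares similarity transform (Umeyama, 1991).

    Finds scale s, rotation R, translation t minimizing
    sum_i || dst_i - (s * R @ src_i + t) ||^2.
    src, dst: (N, 3) arrays of corresponding points.
    Assumption for illustration: src is in a scale-free camera frame,
    dst holds the same points in the robot frame.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]

    # Center both point sets.
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Cross-covariance between the target and source points.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / n
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


# Synthetic check: build a known transform and recover it.
rng = np.random.default_rng(0)
points_cam = rng.normal(size=(20, 3))          # hypothetical camera-frame points
angle = 0.4
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
s_true, t_true = 1.8, np.array([0.3, -0.1, 0.5])
points_robot = s_true * points_cam @ R_true.T + t_true

s, R, t = umeyama_alignment(points_cam, points_robot)
residual = np.abs(s * points_cam @ R.T + t - points_robot).max()
print("recovered scale:", s, "max residual:", residual)
```

Once `s`, `R`, and `t` are known, any geometry estimated in the camera frame can be mapped into the robot frame with the same transform. With noisy correspondences or outliers, a robust variant would be needed.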
Taxonomy
Topics: Robot Manipulation and Learning · Robotic Mechanisms and Dynamics · Advanced Numerical Analysis Techniques
Methods: Sparse Evolutionary Training
