TL;DR
This paper introduces a deep-learning method that estimates camera-to-robot pose from a single RGB image, enabling online calibration with no offline calibration step, and reports accuracy comparable to classic hand-eye calibration.
Contribution
A neural network trained entirely on simulated data with domain randomization detects 2D keypoints of the robot; Perspective-n-Point (PnP) then recovers the camera extrinsics from a single image, with no offline calibration step.
Findings
Achieves accuracy comparable to classic calibration with a single image
Improves accuracy with additional frames from static poses
Works across multiple robot types and camera sensors
Abstract
We present an approach for estimating the pose of an external camera with respect to a robot using a single RGB image of the robot. The image is processed by a deep neural network to detect 2D projections of keypoints (such as joints) associated with the robot. The network is trained entirely on simulated data using domain randomization to bridge the reality gap. Perspective-n-point (PnP) is then used to recover the camera extrinsics, assuming that the camera intrinsics and joint configuration of the robot manipulator are known. Unlike classic hand-eye calibration systems, our method does not require an off-line calibration step. Rather, it is capable of computing the camera extrinsics from a single frame, thus opening the possibility of on-line calibration. We show experimental results for three different robots and camera sensors, demonstrating that our approach is able to achieve…
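The geometric step described in the abstract (2D keypoint detections plus known intrinsics and joint configuration, fed to PnP) can be illustrated with a small sketch. This is not the authors' implementation: it swaps in a basic linear DLT solver written with NumPy in place of whatever PnP solver the paper uses, and it assumes noise-free detections. The names `project` and `pnp_dlt`, the keypoint coordinates, and the intrinsics below are all hypothetical values chosen for illustration. In practice the 3D keypoints would come from forward kinematics on the known joint configuration, and the 2D pixels from the network.

```python
import numpy as np

def project(K, R, t, pts_robot):
    """Pinhole projection of Nx3 robot-frame points to Nx2 pixel coordinates."""
    cam = pts_robot @ R.T + t            # robot frame -> camera frame
    uvw = cam @ K.T                      # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]      # perspective divide

def pnp_dlt(K, pts_robot, pixels):
    """Linear (DLT) PnP: estimate R, t mapping robot-frame points into the camera frame.

    A simple stand-in solver; needs >= 6 non-coplanar correspondences.
    """
    n = len(pts_robot)
    if n < 6:
        raise ValueError("DLT needs at least 6 correspondences")
    ones = np.ones((n, 1))
    # Remove the intrinsics: pixels -> normalized image coordinates.
    xy = np.hstack([pixels, ones]) @ np.linalg.inv(K).T
    xy = xy[:, :2] / xy[:, 2:3]
    Xh = np.hstack([pts_robot, ones])    # homogeneous 3D points, Nx4
    # Two linear equations per correspondence in the 12 entries of P = [R | t].
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = Xh
    A[0::2, 8:12] = -xy[:, 0:1] * Xh
    A[1::2, 4:8] = Xh
    A[1::2, 8:12] = -xy[:, 1:2] * Xh
    # The null vector of A is P up to scale and sign.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)
    if np.linalg.det(P[:, :3]) < 0:      # fix the sign so the rotation is proper
        P = -P
    # Project the 3x3 block onto the nearest rotation and undo the scale.
    U, S, Vt_m = np.linalg.svd(P[:, :3])
    R = U @ Vt_m
    t = P[:, 3] / S.mean()
    return R, t

# Hypothetical robot keypoints in the robot base frame (metres).
keypoints_3d = np.array([
    [0.00, 0.00, 0.00], [0.00, 0.00, 0.33], [0.10, 0.05, 0.65], [0.30, -0.10, 0.90],
    [0.50, 0.10, 0.80], [0.60, 0.00, 0.60], [0.65, 0.15, 0.50], [0.40, 0.30, 0.20],
])
# Hypothetical camera intrinsics (assumed known, as in the abstract).
K = np.array([[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]])

# Ground-truth extrinsics used only to synthesize "detections" for this demo.
a, b = 0.4, -0.7
rot_x = np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]])
rot_z = np.array([[np.cos(b), -np.sin(b), 0], [np.sin(b), np.cos(b), 0], [0, 0, 1]])
R_true = rot_x @ rot_z
t_true = np.array([0.1, -0.2, 2.0])

keypoints_2d = project(K, R_true, t_true, keypoints_3d)   # stands in for network output
R_est, t_est = pnp_dlt(K, keypoints_3d, keypoints_2d)
```

With real, noisy detections a linear solve like this would normally be followed by a nonlinear refinement of the reprojection error, or replaced by a robust library solver. Because the solver accepts any number of correspondences, keypoints from several frames taken with a static camera can be stacked into one call, which is one plausible way to read the multi-frame finding above.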
