CtRNet-X: Camera-to-Robot Pose Estimation in Real-world Conditions Using a Single Camera

Jingpei Lu; Zekai Liang; Tristin Xie; Florian Richter; Shan Lin; Sainan Liu; Michael C. Yip

arXiv:2409.10441 · cs.RO · September 17, 2024

Open Access

TL;DR

This paper introduces CtRNet-X, a novel framework for camera-to-robot pose estimation that remains accurate even when only parts of the robot are visible, using vision-language models and keypoint detection.

Contribution

It presents a new method combining vision-language models with keypoint-based pose estimation to handle partial robot views in real-world scenarios.

Findings

01 Effective in partial-view conditions

02 Robust across diverse datasets

03 Outperforms existing methods in real-world scenarios

Abstract

Camera-to-robot calibration is crucial for vision-based robot control and requires effort to make it accurate. Recent advancements in markerless pose estimation methods have eliminated the need for time-consuming physical setups for camera-to-robot calibration. While the existing markerless pose estimation methods have demonstrated impressive accuracy without the need for cumbersome setups, they rely on the assumption that all the robot joints are visible within the camera's field of view. However, in practice, robots usually move in and out of view, and some portion of the robot may stay out-of-frame during the whole manipulation task due to real-world constraints, leading to a lack of sufficient visual features and subsequent failure of these approaches. To address this challenge and enhance the applicability to vision-based robot control, we propose a novel framework capable of…
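The visibility assumption described in the abstract can be illustrated with a small sketch. It is not the CtRNet-X method. It only projects hypothetical robot joint positions through a standard pinhole camera model and reports which joints land inside the image, which is the condition earlier markerless methods depend on. The joint coordinates, the camera pose (`R`, `t`), the intrinsics (`fx`, `fy`, `cx`, `cy`), and the image size are all made-up values chosen for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Point3 = Tuple[float, float, float]
Pixel = Tuple[float, float]


def transform_point(R: Sequence[Sequence[float]], t: Point3, p: Point3) -> Point3:
    """Map a point from the robot base frame to the camera frame: R @ p + t."""
    x, y, z = (sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
    return (x, y, z)


def project_keypoints(
    points_base: Sequence[Point3],
    R: Sequence[Sequence[float]],
    t: Point3,
    fx: float, fy: float, cx: float, cy: float,
    width: int, height: int,
) -> List[Tuple[Optional[Pixel], bool]]:
    """Project 3D joints with a pinhole model and flag whether each is in frame."""
    results: List[Tuple[Optional[Pixel], bool]] = []
    for p in points_base:
        x, y, z = transform_point(R, t, p)
        if z <= 0.0:
            # Behind the camera plane: no valid projection.
            results.append((None, False))
            continue
        u = fx * x / z + cx
        v = fy * y / z + cy
        in_frame = 0.0 <= u < width and 0.0 <= v < height
        results.append(((u, v), in_frame))
    return results


# Hypothetical setup: camera 1 m in front of the robot base, axes aligned.
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_cam = (0.0, 0.0, 1.0)

# Hypothetical joint positions in the robot base frame (metres).
joints_base = [
    (0.0, 0.0, 0.0),   # base, projects to the image centre
    (0.2, 0.0, 0.0),   # elbow, still inside the image
    (0.8, 0.0, 0.0),   # wrist, projects past the right image border
    (0.0, 0.0, -1.5),  # a point that ends up behind the camera
]

results = project_keypoints(
    joints_base, R_identity, t_cam,
    fx=500.0, fy=500.0, cx=320.0, cy=240.0,
    width=640, height=480,
)
visible = [in_frame for _, in_frame in results]
print(visible)
```

In this made-up configuration only two of the four joints are in frame, so a method that requires every joint to be visible would lack the features it needs, which is the failure case the abstract describes.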

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Image Processing Techniques and Applications