GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration Estimation

Ivan Bilić; Filip Marić; Fabio Bonsignorio; Ivan Petrović

arXiv:2405.04890·cs.RO·January 3, 2025

TL;DR

GISR is a real-time deep learning method for estimating robot pose and configuration from a single view, combining geometric initialization with silhouette-based refinement for improved speed and accuracy.

Contribution

The paper introduces a two-module approach, geometric initialization followed by silhouette-based refinement, that estimates robot pose and configuration efficiently and outperforms existing methods in speed and accuracy.

Findings

01. Outperforms existing methods in speed and accuracy

02. Recovers robot configuration and pose from a single view

03. Operates in real time with minimal iterations

Abstract

In autonomous robotics, measurement of the robot's internal state and perception of its environment, including interaction with other agents such as collaborative robots, are essential. Estimating the pose of the robot arm from a single view has the potential to replace classical eye-to-hand calibration approaches and is particularly attractive for online estimation and dynamic environments. In addition to its pose, recovering the robot configuration provides a complete spatial understanding of the observed robot that can be used to anticipate the actions of other agents in advanced robotics use cases. Furthermore, this additional redundancy enables the planning and execution of recovery protocols in case of sensor failures or external disturbances. We introduce GISR - a deep configuration and robot-to-camera pose estimation method that prioritizes execution in real-time. GISR consists…

Taxonomy

Topics: Robotic Mechanisms and Dynamics · Robot Manipulation and Learning · Manufacturing Process and Optimization

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings