GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration Estimation
Ivan Bili\'c, Filip Mari\'c, Fabio Bonsignorio, Ivan Petrovi\'c

TL;DR
GISR is a real-time deep learning method for estimating robot pose and configuration from a single view, combining geometric initialization with silhouette-based refinement for improved speed and accuracy.
Contribution
It introduces a novel two-module approach that efficiently estimates robot pose and configuration, outperforming existing methods in speed and accuracy.
Findings
Outperforms existing methods in speed and accuracy
Can recover robot configuration and pose from a single view
Operates in real-time with minimal iterations
Abstract
In autonomous robotics, measurement of the robot's internal state and perception of its environment, including interaction with other agents such as collaborative robots, are essential. Estimating the pose of the robot arm from a single view has the potential to replace classical eye-to-hand calibration approaches and is particularly attractive for online estimation and dynamic environments. In addition to its pose, recovering the robot configuration provides a complete spatial understanding of the observed robot that can be used to anticipate the actions of other agents in advanced robotics use cases. Furthermore, this additional redundancy enables the planning and execution of recovery protocols in case of sensor failures or external disturbances. We introduce GISR - a deep configuration and robot-to-camera pose estimation method that prioritizes execution in real-time. GISR consists…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsRobotic Mechanisms and Dynamics · Robot Manipulation and Learning · Manufacturing Process and Optimization
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
