RGB-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control

Zhenggang Tang; Balakumar Sundaralingam; Jonathan Tremblay; Bowen Wen; Ye Yuan; Stephen Tyree; Charles Loop; Alexander Schwing; Stan Birchfield

arXiv:2210.11668·cs.RO·March 13, 2023

TL;DR

This paper introduces a system that lets a robot manipulator reach a desired pose in a tabletop scene without collisions, using only RGB images. The scene's 3D geometry is reconstructed with a NeRF-like process, and the robot is driven by model predictive control.

Contribution

The approach reconstructs 3D scene geometry from RGB images alone, computes an ESDF from that geometry, and uses the ESDF within model predictive control for collision-free manipulation.

Findings

01. Successful 3D reconstruction from RGB images without depth.

02. Effective collision avoidance in real tabletop scenes.

03. Real-world dataset demonstrating system performance.

Abstract

We present a system for collision-free control of a robot manipulator that uses only RGB views of the world. Perceptual input of a tabletop scene is provided by multiple images of an RGB camera (without depth) that is either handheld or mounted on the robot end effector. A NeRF-like process is used to reconstruct the 3D geometry of the scene, from which the Euclidean full signed distance function (ESDF) is computed. A model predictive control algorithm is then used to control the manipulator to reach a desired pose while avoiding obstacles in the ESDF. We show results on a real dataset collected and annotated in our lab.
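To make the role of the ESDF concrete, here is a minimal sketch of a signed distance field and a clearance check against it. It is not the authors' implementation: the paper works in 3D on geometry recovered by a NeRF-like process, whereas this toy uses a hand-written 2D occupancy grid and a brute-force nearest-cell search. The names `signed_distance_field` and `path_is_collision_free`, the sign convention (positive in free space, negative inside obstacles), the 5 cm cell size, and the disc-shaped robot are all assumptions made for illustration.

```python
import math


def signed_distance_field(occupancy, voxel_size=1.0):
    """Brute-force 2D signed distance field over a small occupancy grid.

    Assumed convention: free cells hold the (positive) distance to the
    nearest occupied cell; occupied cells hold the negated distance to
    the nearest free cell. Distances are measured between cell centres.
    """
    rows, cols = len(occupancy), len(occupancy[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    occupied = [rc for rc in cells if occupancy[rc[0]][rc[1]]]
    free = [rc for rc in cells if not occupancy[rc[0]][rc[1]]]

    def nearest(cell, targets):
        # Distance in metric units from `cell` to the closest target cell.
        if not targets:
            return math.inf
        best = min(math.hypot(cell[0] - t[0], cell[1] - t[1]) for t in targets)
        return best * voxel_size

    field = [[0.0] * cols for _ in range(rows)]
    for r, c in cells:
        if occupancy[r][c]:
            field[r][c] = -nearest((r, c), free)
        else:
            field[r][c] = nearest((r, c), occupied)
    return field


def path_is_collision_free(field, path_cells, radius):
    """True if a disc of `radius` centred on every path cell clears all obstacles."""
    return all(field[r][c] >= radius for r, c in path_cells)


# Hypothetical 5x5 scene with a single obstacle cell in the middle, 5 cm cells.
grid = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
esdf = signed_distance_field(grid, voxel_size=0.05)

top_row = [(0, c) for c in range(5)]     # skirts the obstacle
middle_row = [(2, c) for c in range(5)]  # passes through the obstacle
print(path_is_collision_free(esdf, top_row, radius=0.08))
print(path_is_collision_free(esdf, middle_row, radius=0.08))
```

A controller in the spirit of the abstract would query such a field at many points along each candidate trajectory and penalise small or negative clearances, rather than returning a single yes/no answer. The brute-force search is quadratic in the number of cells and is only sensible for a toy grid of this size.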

Taxonomy

Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Robotics and Sensor-Based Localization