RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation

Chongkai Gao; Zhengrong Xue; Shuying Deng; Tianhai Liang; Siqi Yang; Lin Shao; Huazhe Xu

arXiv:2403.19460 · cs.RO · October 4, 2024 · 1 citation

Open Access

TL;DR

RiEMann is a near real-time SE(3)-equivariant imitation learning framework for robot manipulation from scene point clouds. It generalizes to unseen target objects and SE(3) transformations without object segmentation, and outperforms baselines in success rate and pose accuracy.

Contribution

RiEMann introduces a novel end-to-end framework that predicts target poses directly from point clouds, eliminating the need for segmentation and enabling near real-time manipulation.

Findings

01

Outperforms baselines in success rates and pose accuracy.

02

Achieves an inference speed of 5.4 FPS (roughly 185 ms per prediction).

03

Generalizes to unseen objects and transformations.

Abstract

We present RiEMann, an end-to-end near Real-time SE(3)-Equivariant Robot Manipulation imitation learning framework from scene point cloud input. Compared to previous methods that rely on descriptor field matching, RiEMann directly predicts the target poses of objects for manipulation without any object segmentation. RiEMann learns a manipulation task from scratch with 5 to 10 demonstrations, generalizes to unseen SE(3) transformations and instances of target objects, resists visual interference of distracting objects, and follows the near real-time pose change of the target object. The scalable action space of RiEMann facilitates the addition of custom equivariant actions such as the direction of turning the faucet, which makes articulated object manipulation possible for RiEMann. In simulation and real-world 6-DOF robot manipulation experiments, we test RiEMann on 5 categories of…
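The SE(3)-equivariance property named in the abstract can be illustrated with a small numerical check. The sketch below is not RiEMann's network: `toy_predictor` is a hypothetical hand-written stand-in that returns a position (the centroid) and a direction (toward the farthest point), and `random_se3` and `equivariance_error` are illustrative helper names. It only shows what the property means: if the input cloud is moved by a rotation R and a translation t, an equivariant predictor's position output moves by the same (R, t) and its direction output rotates by R.

```python
import numpy as np


def random_se3(rng):
    """Sample a random rotation matrix (det = +1) and a translation vector."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs of the orthogonal factor
    if np.linalg.det(q) < 0:         # turn a reflection into a proper rotation
        q[:, 0] = -q[:, 0]
    t = rng.normal(size=3)
    return q, t


def toy_predictor(points):
    """Hypothetical stand-in for a learned policy (NOT the RiEMann model).

    Returns a position (the centroid) and a unit direction (from the centroid
    to the farthest point). Both are equivariant by construction.
    """
    position = points.mean(axis=0)
    offsets = points - position
    farthest = offsets[np.argmax(np.linalg.norm(offsets, axis=1))]
    direction = farthest / np.linalg.norm(farthest)
    return position, direction


def equivariance_error(points, rotation, translation):
    """Compare predict(transform(P)) against transform(predict(P))."""
    pos, direction = toy_predictor(points)
    moved = points @ rotation.T + translation   # apply (R, t) to every point
    pos_moved, dir_moved = toy_predictor(moved)
    pos_err = np.linalg.norm(pos_moved - (rotation @ pos + translation))
    dir_err = np.linalg.norm(dir_moved - rotation @ direction)
    return pos_err, dir_err


rng = np.random.default_rng(0)
cloud = rng.normal(size=(256, 3))    # synthetic scene point cloud
R, t = random_se3(rng)
pos_err, dir_err = equivariance_error(cloud, R, t)
print(f"position error: {pos_err:.2e}, direction error: {dir_err:.2e}")
```

Both errors should be at the level of floating-point round-off. A learned model that is equivariant by design would pass the same kind of check for any sampled (R, t), which is what allows a few demonstrations to cover unseen object poses.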

Taxonomy

Topics: Image Processing and 3D Reconstruction · Robot Manipulation and Learning · 3D Shape Modeling and Analysis