SE3-Pose-Nets: Structured Deep Dynamics Models for Visuomotor Planning and Control

Arunkumar Byravan; Felix Leeb; Franziska Meier; Dieter Fox

arXiv:1710.00489·cs.RO·October 3, 2017·31 cites

Open Access

TL;DR

This paper introduces SE3-Pose-Nets, a structured deep dynamics model for visuomotor control. The model learns to segment a scene into parts and predict their poses from point cloud data, and it supports real-time robot control in both simulation and real-world settings.

Contribution

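The work presents a structured deep dynamics model that explicitly segments a scene into salient parts, predicts a pose for each part, and models motion as a change in those poses under the applied actions. It reports improved control accuracy and efficiency over prior unstructured methods.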

Findings

01 Achieves real-time control on a Baxter robot from raw depth data

02 Outperforms baseline deep networks in scene prediction and control tasks

03 Applies successfully to both simulation and real-world scenarios

Abstract

In this work, we present an approach to deep visuomotor control using structured deep dynamics models. Our deep dynamics model, a variant of SE3-Nets, learns a low-dimensional pose embedding for visuomotor control via an encoder-decoder structure. Unlike prior work, our dynamics model is structured: given an input scene, our network explicitly learns to segment salient parts and predict their pose-embedding along with their motion modeled as a change in the pose space due to the applied actions. We train our model using a pair of point clouds separated by an action and show that given supervision only in the form of point-wise data associations between the frames our network is able to learn a meaningful segmentation of the scene along with consistent poses. We further show that our model can be used for closed-loop control directly in the learned low-dimensional pose space, where the…
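The idea of segmenting a scene into parts and moving each part by its own rigid transform can be illustrated with a small sketch. This is not the authors' network: it assumes that soft segmentation masks and per-part SE(3) transforms are combined by a mask-weighted sum, which the abstract does not spell out, and the names `rot_z`, `apply_rigid`, and `blend_transforms` are hypothetical helpers invented for this example. In the paper the masks and poses are predicted by a learned encoder; here they are written by hand.

```python
import math

def rot_z(theta):
    """3x3 rotation matrix about the z-axis, as row-major nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(R, t, p):
    """Return R @ p + t for a single 3D point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def blend_transforms(points, masks, transforms):
    """Move each point by the mask-weighted sum of K rigid transforms.

    points:     list of N points [x, y, z]
    masks:      list of N weight lists, each of length K and summing to 1
    transforms: list of K (R, t) pairs (assumed form, one per scene part)
    """
    out = []
    for p, w in zip(points, masks):
        moved = [apply_rigid(R, t, p) for R, t in transforms]
        out.append([sum(w[k] * moved[k][i] for k in range(len(transforms)))
                    for i in range(3)])
    return out

# Toy scene: part 0 is static background; part 1 rotates 90 degrees
# about z and lifts by 0.5 along z.
points = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
masks = [[1.0, 0.0], [0.0, 1.0]]  # hard assignment, for clarity only
identity = (rot_z(0.0), [0.0, 0.0, 0.0])
quarter_turn = (rot_z(math.pi / 2), [0.0, 0.0, 0.5])
predicted = blend_transforms(points, masks, [identity, quarter_turn])
```

With these hand-written inputs the background point stays where it is, while the point assigned to the moving part ends up near (-2, 0, 0.5). A learned model would instead produce soft masks and transforms conditioned on the input point cloud and the applied action.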

Taxonomy

Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Robot Manipulation and Learning