Learning 3D Dynamic Scene Representations for Robot Manipulation

Zhenjia Xu; Zhanpeng He; Jiajun Wu; Shuran Song

arXiv:2011.01968 · cs.RO · December 11, 2020 · 21 cites

TL;DR

This paper introduces DSR, a 3D scene representation capturing object permanency, amodal completeness, and spatiotemporal continuity, enabling improved robot manipulation through learned dynamic scene understanding.

Contribution

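The paper presents DSR, a volumetric 3D scene representation, and DSR-Net, a model that learns to build and refine DSR by aggregating visual observations over multiple interactions. Together they discover, track, and reconstruct objects and predict their dynamics for robot manipulation.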

Findings

01. Achieves state-of-the-art performance in modeling 3D scene dynamics.

02. Enables accurate planning in robotic manipulation tasks.

03. Works effectively on both simulated and real data.

Abstract

3D scene representation for robot manipulation should capture three key object properties: permanency (objects that become occluded over time continue to exist); amodal completeness (objects have 3D occupancy, even if only partial observations are available); and spatiotemporal continuity (the movement of each object is continuous over space and time). In this paper, we introduce 3D Dynamic Scene Representation (DSR), a 3D volumetric scene representation that simultaneously discovers, tracks, and reconstructs objects and predicts their dynamics while capturing all three properties. We further propose DSR-Net, which learns to aggregate visual observations over multiple interactions to gradually build and refine DSR. Our model achieves state-of-the-art performance in modeling 3D scene dynamics with DSR on both simulated and real data. Combined with model predictive control, DSR-Net enables…
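
The abstract mentions pairing a learned dynamics model with model predictive control. As a rough illustration of that general control pattern only, the sketch below runs random-shooting MPC on a toy 2D pushing problem. It is not the authors' implementation: `toy_dynamics`, `plan_first_action`, `run_mpc`, and all parameter values are hypothetical, and `toy_dynamics` is a stand-in for whatever a learned model would predict.

```python
import math
import random


def toy_dynamics(position, action):
    """Hypothetical stand-in for a learned dynamics model.

    Assumes a push simply displaces the object by `action`.
    """
    return (position[0] + action[0], position[1] + action[1])


def plan_first_action(position, goal, predict, rng,
                      horizon=3, num_samples=200, max_step=0.1):
    """Random-shooting MPC step.

    Samples random action sequences, rolls each one out with `predict`,
    scores it by the summed distance to the goal along the rollout, and
    returns the first action of the lowest-cost sequence.
    """
    best_cost, best_action = math.inf, (0.0, 0.0)
    for _ in range(num_samples):
        sequence = [
            (rng.uniform(-max_step, max_step), rng.uniform(-max_step, max_step))
            for _ in range(horizon)
        ]
        predicted, cost = position, 0.0
        for action in sequence:
            predicted = predict(predicted, action)
            cost += math.dist(predicted, goal)
        if cost < best_cost:
            best_cost, best_action = cost, sequence[0]
    return best_action


def run_mpc(start, goal, steps=20, seed=0):
    """Closed loop: plan, execute only the first action, then replan."""
    rng = random.Random(seed)
    position = start
    for _ in range(steps):
        action = plan_first_action(position, goal, toy_dynamics, rng)
        # In this toy setup, "executing" reuses the same dynamics function.
        position = toy_dynamics(position, action)
    return position


if __name__ == "__main__":
    start, goal = (0.0, 0.0), (0.5, 0.3)
    final = run_mpc(start, goal)
    print("distance before:", math.dist(start, goal))
    print("distance after: ", math.dist(final, goal))
```

Executing only the first planned action and replanning at every step is what makes this a receding-horizon scheme; in a real system the prediction function would come from a learned model and execution would come from the robot, so prediction errors get corrected at each replan.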

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · 3D Shape Modeling and Analysis