Unsupervised Learning of Visual 3D Keypoints for Control

Boyuan Chen; Pieter Abbeel; Deepak Pathak

arXiv:2106.07643 · cs.LG · June 15, 2021 · 1 citation

TL;DR

This paper introduces an unsupervised method to learn 3D visual keypoints directly from images, improving robotic control by capturing meaningful 3D structures for better policy learning.

Contribution

It presents a novel end-to-end framework that learns 3D geometric keypoints from images without supervision, outperforming existing 2D-based methods in control tasks.

Findings

01  Outperforms prior state-of-the-art methods in reinforcement learning benchmarks.

02  Learns meaningful 3D keypoints that capture robot joints and object movements.

03  Demonstrates the effectiveness of 3D structure learning in control environments.

Abstract

Learning sensorimotor control policies from high-dimensional images crucially relies on the quality of the underlying visual representations. Prior works show that structured latent spaces such as visual keypoints often outperform unstructured representations for robotic control. However, most of these representations, whether structured or unstructured, are learned in a 2D space even though the control tasks are usually performed in a 3D environment. In this work, we propose a framework to learn such a 3D geometric structure directly from images in an end-to-end unsupervised manner. The input images are embedded into latent 3D keypoints via a differentiable encoder which is trained to optimize both a multi-view consistency loss and a downstream task objective. These discovered 3D keypoints tend to meaningfully capture robot joints as well as object movements in a consistent manner across…
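
The abstract mentions a multi-view consistency loss without giving its form. The sketch below is one plausible reading, not the authors' implementation: it assumes each camera predicts the same set of keypoints in its own camera frame, that camera-to-world rotations and translations are known, and that the loss penalizes disagreement between views once the points are mapped to a shared world frame. The function names and the mean-squared-deviation form are hypothetical choices made for illustration.

```python
# Illustrative sketch only; NOT the paper's implementation.
# Assumptions: known camera-to-world extrinsics (R, t) per view, and one
# predicted 3D point per keypoint per view, given in camera coordinates.

def to_world(point, rotation, translation):
    """Map a camera-frame point to the world frame: world = R @ point + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


def multiview_consistency_loss(keypoints_per_view, rotations, translations):
    """Mean squared distance between each view's world-frame keypoints
    and the per-keypoint average over all views (hypothetical loss form)."""
    num_views = len(keypoints_per_view)
    num_keypoints = len(keypoints_per_view[0])
    world = [
        [to_world(p, rotations[n], translations[n]) for p in keypoints_per_view[n]]
        for n in range(num_views)
    ]
    total = 0.0
    for k in range(num_keypoints):
        # Average position of keypoint k across views.
        mean_k = [
            sum(world[n][k][i] for n in range(num_views)) / num_views
            for i in range(3)
        ]
        for n in range(num_views):
            total += sum((world[n][k][i] - mean_k[i]) ** 2 for i in range(3))
    return total / (num_views * num_keypoints)


# Two toy cameras: identity pose, and a 90-degree rotation about z plus a shift.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rot_z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
rotations = [identity, rot_z_90]
translations = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

# Both views describe the world point (1, 2, 3), so the loss is zero.
consistent = [[[1.0, 2.0, 3.0]], [[2.0, 0.0, 3.0]]]
loss_consistent = multiview_consistency_loss(consistent, rotations, translations)

# The second view now disagrees by 2 units along z, so the loss is positive.
inconsistent = [[[1.0, 2.0, 3.0]], [[2.0, 0.0, 5.0]]]
loss_inconsistent = multiview_consistency_loss(inconsistent, rotations, translations)
```

In a trainable version this quantity would be computed on encoder outputs with an automatic-differentiation library and combined with the downstream task objective; how the two terms are weighted is not stated in the text above.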

Code & Models

Repositories

buoyancy99/unsup-3d-keypoints (PyTorch, official)

Videos

Unsupervised Learning of Visual 3D Keypoints for Control · SlidesLive

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Human Pose and Action Recognition