Self-Supervised Learning of Multi-Object Keypoints for Robotic Manipulation

Jan Ole von Hartz; Eugenio Chisari; Tim Welschehold; Abhinav Valada

arXiv:2205.08316 · cs.RO · October 12, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a method for learning multi-object keypoints from raw images using dense correspondence, improving sample efficiency and robustness for robotic manipulation tasks.

Contribution

It extends prior keypoint learning methods to multi-object scenes, addressing scale-invariance and occlusion, and demonstrates improved policy learning from raw camera data.

Findings

01. Effective keypoint learning in multi-object scenes

02. Enhanced robustness to scale and occlusion

03. Sample-efficient policy learning demonstrated

Abstract

In recent years, policy learning methods using either reinforcement or imitation have made significant progress. However, both techniques still suffer from being computationally expensive and requiring large amounts of training data. This problem is especially prevalent in real-world robotic manipulation tasks, where access to ground truth scene features is not available and policies are instead learned from raw camera observations. In this paper, we demonstrate the efficacy of learning image keypoints via the Dense Correspondence pretext task for downstream policy learning. Extending prior work to challenging multi-object scenes, we show that our model can be trained to deal with important problems in representation learning, primarily scale-invariance and occlusion. We evaluate our approach on diverse robot manipulation tasks, compare it to other visual representation learning…
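As a rough illustration of how a keypoint might be read out of a dense descriptor map, the sketch below computes the expected pixel location of a reference descriptor with a spatial softmax over descriptor distances. The abstract does not specify the readout used in the paper, so the function name `keypoint_from_descriptors`, the `temperature` parameter, and the squared Euclidean distance are all assumptions made for illustration.

```python
import numpy as np


def keypoint_from_descriptors(descriptor_image, reference_descriptor, temperature=1.0):
    """Return the expected (x, y) pixel location of a reference descriptor.

    Hypothetical helper, not the paper's implementation.
    descriptor_image: (H, W, D) array of per-pixel descriptors.
    reference_descriptor: (D,) descriptor of the point to localize.
    temperature: assumed softmax temperature; smaller values sharpen the peak.
    """
    h, w, _ = descriptor_image.shape
    # Squared Euclidean distance from every pixel's descriptor to the reference.
    sq_dist = np.sum((descriptor_image - reference_descriptor) ** 2, axis=-1)
    # Softmax over negative distances; subtracting the max keeps exp() stable.
    logits = -sq_dist / temperature
    logits = logits - logits.max()
    weights = np.exp(logits)
    weights = weights / weights.sum()
    # Expected pixel coordinates under the softmax distribution.
    ys, xs = np.mgrid[0:h, 0:w]
    return float((weights * xs).sum()), float((weights * ys).sum())


# Toy descriptor map: all zeros except one distinctive pixel at row 3, column 7.
toy_map = np.zeros((8, 10, 3))
toy_map[3, 7] = [10.0, 10.0, 10.0]
x, y = keypoint_from_descriptors(toy_map, np.array([10.0, 10.0, 10.0]), temperature=0.01)
```

A soft readout like this yields sub-pixel coordinates and stays differentiable, which is one reason it is a common choice when keypoints feed a downstream policy; a hard argmin over the distance map would be a simpler alternative.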

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques