Contrastive Learning from Demonstrations

André Correia; Luís A. Alexandre

arXiv:2201.12813 · cs.CV · January 30, 2023

Open Access

TL;DR

This paper introduces a contrastive self-supervised learning framework for extracting visual representations from unlabeled multi-view videos, improving robotic task imitation and reducing the number of training iterations.

Contribution

The paper applies contrastive learning to multi-view video demonstrations of robotic tasks, improving representation quality and training efficiency.

Findings

01. Improved viewpoint alignment and stage classification accuracy.

02. Enhanced reinforcement learning performance.

03. Reduced training iterations compared to state-of-the-art methods.
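
The page does not say how viewpoint alignment is scored. One common way to measure it, assumed here for illustration, is to take each frame's embedding from one camera, find its nearest neighbor among the embeddings of the other camera, and report the average time offset as a fraction of the video length. The function name `alignment_error` and the toy one-dimensional embeddings are hypothetical; this is a sketch of the idea, not the paper's evaluation code.

```python
def alignment_error(view_a, view_b):
    """Mean normalized time offset between each frame in view_a and its
    nearest neighbor (squared Euclidean distance) in view_b.

    Assumption: both lists are time-aligned, so index i in either view
    refers to the same moment of the demonstration.
    """
    n = len(view_b)
    total = 0.0
    for i, anchor in enumerate(view_a):
        dists = [sum((x - y) ** 2 for x, y in zip(anchor, other))
                 for other in view_b]
        j = dists.index(min(dists))   # time index of the nearest neighbor
        total += abs(i - j) / n       # offset as a fraction of video length
    return total / len(view_a)


# Toy embeddings: identical views align perfectly, a reversed view does not.
forward = [[0.0], [1.0], [2.0]]
backward = [[2.0], [1.0], [0.0]]
perfect = alignment_error(forward, forward)
reversed_error = alignment_error(forward, backward)
```

Lower is better under this definition: an error of zero means every frame retrieves its own time step from the other viewpoint.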

Abstract

This paper presents a framework for learning visual representations from unlabeled video demonstrations captured from multiple viewpoints. We show that these representations can be used to imitate several robotic tasks, including pick and place. We optimize a recently proposed self-supervised learning algorithm by applying contrastive learning to enhance task-relevant information while suppressing irrelevant information in the feature embeddings. We validate the proposed method on the publicly available Multi-View Pouring data set and a custom Pick and Place data set, and compare it with the TCN triplet baseline. We evaluate the learned representations using three metrics: viewpoint alignment, stage classification, and reinforcement learning. In all cases the results improve on state-of-the-art approaches, with the added benefit of a reduced number of training iterations.
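
The abstract does not state the exact loss the authors use. As a rough illustration of how contrastive learning can be applied to multi-view video, the sketch below uses an InfoNCE-style loss as a stand-in: frames captured at the same time step from two cameras form a positive pair, and frames from other time steps serve as negatives. The function names, the `temperature` value, and the toy embeddings are all assumptions made for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce(view_a, view_b, temperature=0.1):
    """InfoNCE-style loss over time-aligned embeddings from two viewpoints.

    view_a[i] and view_b[i] embed the same time step seen from two cameras
    (the positive pair); every view_b[j] with j != i acts as a negative.
    This is an illustrative stand-in, not the loss from the paper.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity to every frame of the other view, scaled by temperature.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature
                  for other in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]   # -log softmax of the positive
    return total / len(a)


# Toy two-frame video: matching views give a low loss, swapped views a high one.
frames = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = info_nce(frames, frames)
loss_swapped = info_nce(frames, swapped)
```

Minimizing a loss of this shape pulls together embeddings of the same moment across viewpoints and pushes apart embeddings of different moments, which is one way to keep task-relevant information while discarding viewpoint-specific detail.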

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications

Methods: Contrastive Learning