Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations

Negin Heravi; Ayzaan Wahid; Corey Lynch; Pete Florence; Travis Armstrong; Jonathan Tompson; Pierre Sermanet; Jeannette Bohg; Debidatta Dwibedi

arXiv:2205.06333 · cs.RO · March 14, 2023

TL;DR

This paper demonstrates that object-aware self-supervised representations significantly improve robotic control and object localization in multi-object scenes, especially in low-data scenarios, compared to object-agnostic methods.

Contribution

It introduces an object-aware self-supervised learning approach for robotic tasks, enhancing control and localization performance over existing object-agnostic techniques.

Findings

01. 20% performance increase in low-data policy training

02. Outperforms object-agnostic methods in scene understanding

03. Effective in multi-object scene control and localization

Abstract

Perceptual understanding of the scene and the relationship between its different components is important for successful completion of robotic tasks. Representation learning has been shown to be a powerful technique for this, but most current methodologies learn task-specific representations that do not necessarily transfer well to other tasks. Furthermore, representations learned by supervised methods require large labeled datasets for each task, which are expensive to collect in the real world. Using self-supervised learning to obtain representations from unlabeled data can mitigate this problem. However, current self-supervised representation learning methods are mostly object-agnostic, and we demonstrate that the resulting representations are insufficient for general-purpose robotics tasks as they fail to capture the complexity of scenes with many components. In this paper, we…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition