Self-Supervised Object Goal Navigation with In-Situ Finetuning

So Yeon Min; Yao-Hung Hubert Tsai; Wei Ding; Ali Farhadi; Ruslan Salakhutdinov; Yonatan Bisk; Jian Zhang

arXiv:2212.05923 · cs.RO · April 4, 2023 · 1 citation

TL;DR

This paper introduces a self-supervised approach for object goal navigation that enables robots to learn and adapt in real-world environments without relying on expensive labeled 3D data, using location consistency as a key training signal.

Contribution

The work proposes a novel self-supervised training method called LocCon that allows in-situ finetuning of navigation models in real environments without labeled data.

Findings

1. LocCon outperforms models trained with 3D mesh annotations in real-world transfer.
2. Self-supervised in-situ training improves real-world navigation performance.
3. Models trained with LocCon are more robust and less affected by simulation artifacts.

Abstract

A household robot should be able to navigate to target objects without requiring users to first annotate everything in their home. Most current approaches to object navigation do not test on real robots and rely solely on reconstructed scans of houses and their expensively labeled semantic 3D meshes. In this work, our goal is to build an agent that builds self-supervised models of the world via exploration, the same as a child might - thus we (1) eschew the expense of labeled 3D mesh and (2) enable self-supervised in-situ finetuning in the real world. We identify a strong source of self-supervision (Location Consistency - LocCon) that can train all components of an ObjectNav agent, using unannotated simulated houses. Our key insight is that embodied agents can leverage location consistency as a self-supervision signal - collecting images from different views/angles and applying…
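The abstract is cut off in this listing, so the exact objective is not shown here. As a rough illustration of the location-consistency idea only, the sketch below treats views captured at the same location as positives in a generic supervised-contrastive (InfoNCE-style) loss. This is an assumption based on the "Contrastive Learning" tag under Methods, not the authors' implementation; the function name `location_consistency_loss`, the `location_ids` input, and the `temperature` value are all hypothetical.

```python
import numpy as np


def location_consistency_loss(embeddings, location_ids, temperature=0.1):
    """Generic contrastive loss (hypothetical sketch, not the paper's code).

    Views that share a location id are treated as positives for each other;
    all other views in the batch act as negatives.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    ids = np.asarray(location_ids)
    n = z.shape[0]

    self_mask = np.eye(n, dtype=bool)
    # Positives: same location id, excluding the anchor itself.
    pos_mask = (ids[:, None] == ids[None, :]) & ~self_mask
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0
    if not valid.any():
        raise ValueError("need at least two views of some location")

    # Cosine similarity between every pair of views, scaled by temperature.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    logits = (z @ z.T) / temperature
    logits = np.where(self_mask, -np.inf, logits)  # drop self-similarity

    # Numerically stable log-softmax over the other views in the batch.
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Mean negative log-probability of the positives, per anchor.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[valid] / pos_counts[valid]
    return float(per_anchor.mean())


if __name__ == "__main__":
    # Four toy 2-D "view embeddings": two views each of two locations.
    views = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(location_consistency_loss(views, [0, 0, 1, 1]))
    print(location_consistency_loss(views, [0, 1, 0, 1]))
```

In this toy setup the loss is small when embeddings of the same location already agree and large when the location grouping contradicts the embeddings, which is the signal a consistency objective of this kind would push against during finetuning.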

Taxonomy

Topics: Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning

Methods: Test · Contrastive Learning