Reliable Semantic Understanding for Real World Zero-shot Object Goal Navigation

Halil Utku Unlu, Shuaihang Yuan, Congcong Wen, Hao Huang, Anthony Tzes, and Yi Fang

arXiv:2410.21926·cs.RO·October 30, 2024

Open Access

TL;DR

This paper presents a novel dual-model framework combining GLIP and InstructionBLIP to improve semantic understanding in zero-shot object goal navigation, significantly enhancing robot navigation in unfamiliar environments.

Contribution

The paper introduces a new dual-component approach that integrates vision-language models for better semantic recognition and validation in zero-shot navigation tasks.

Findings

01 Improved navigation accuracy in simulated environments.

02 Enhanced reliability of semantic recognition in real-world tests.

03 Significant performance gains over traditional methods.

Abstract

We introduce an innovative approach to advancing semantic understanding in zero-shot object goal navigation (ZS-OGN), enhancing the autonomy of robots in unfamiliar environments. Traditional reliance on labeled data has been a limitation for robotic adaptability, which we address by employing a dual-component framework that integrates a GLIP Vision Language Model for initial detection and an InstructionBLIP model for validation. This combination not only refines object and environmental recognition but also fortifies the semantic interpretation, pivotal for navigational decision-making. Our method, rigorously tested in both simulated and real-world settings, exhibits marked improvements in navigation precision and reliability.
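The detect-then-validate idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Detection` type, the `detector` and `validator` callables, and the `min_score` threshold are all hypothetical stand-ins. In the paper's setting a GLIP-style model would play the detector role and an InstructionBLIP-style model would play the validator role; here both are replaced by stubs so the control flow can be run on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Bounding box as (x_min, y_min, x_max, y_max) in pixels (assumed convention).
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Detection:
    label: str
    box: Box
    score: float


# Hypothetical interfaces, not taken from the paper:
# a detector proposes candidate boxes for a goal label, and a validator
# gives a yes/no answer on whether a boxed region really shows the goal.
Detector = Callable[[object, str], Sequence[Detection]]
Validator = Callable[[object, Box, str], bool]


def detect_and_validate(
    image: object,
    goal: str,
    detector: Detector,
    validator: Validator,
    min_score: float = 0.5,  # assumed threshold, chosen only for illustration
) -> List[Detection]:
    """Return detections that pass the score filter and the second-stage check."""
    # Stage 1: keep sufficiently confident proposals, best first.
    candidates = [d for d in detector(image, goal) if d.score >= min_score]
    candidates.sort(key=lambda d: d.score, reverse=True)
    # Stage 2: keep only proposals the validator confirms.
    return [d for d in candidates if validator(image, d.box, goal)]


def stub_detector(image: object, goal: str) -> List[Detection]:
    # Stand-in for a grounded detector: three fixed proposals.
    return [
        Detection(goal, (60.0, 60.0, 90.0, 90.0), 0.7),
        Detection(goal, (10.0, 10.0, 50.0, 50.0), 0.9),
        Detection(goal, (0.0, 0.0, 5.0, 5.0), 0.2),
    ]


def stub_validator(image: object, box: Box, goal: str) -> bool:
    # Stand-in for a vision-language check: rejects boxes starting at x >= 60.
    return box[0] < 60.0


confirmed = detect_and_validate(None, "chair", stub_detector, stub_validator)
```

With these stubs, the low-score proposal is dropped in the first stage and the 0.7-score proposal is rejected in the second, so only the 0.9-score box survives. Keeping the two stages behind plain callables is a design choice for the sketch: it lets either model be swapped without touching the filtering logic.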

Taxonomy

Topics: AI-based Problem Solving and Planning · Robotic Path Planning Algorithms · Target Tracking and Data Fusion in Sensor Networks