Uncertainty-Informed Active Perception for Open Vocabulary Object Goal Navigation

Utkarsh Bajpai; Julius Rückin; Cyrill Stachniss; Marija Popović

arXiv:2506.13367 · cs.RO · October 9, 2025

TL;DR

This paper introduces a probabilistic perception framework that quantifies semantic uncertainty in vision-language models to improve open vocabulary object goal navigation in indoor environments, leading to more efficient exploration.

Contribution

The paper presents a novel semantic uncertainty model for vision-language perception and integrates it into a probabilistic map and an exploration planner, improving robot navigation without extensive prompt engineering.

Findings

1. Achieves success rates comparable to state-of-the-art methods.

2. Effectively quantifies semantic uncertainty in vision-language perception.

3. Enhances exploration efficiency in indoor object navigation.

Abstract

Mobile robots exploring indoor environments increasingly rely on vision-language models to perceive high-level semantic cues in camera images, such as object categories. Such models offer the potential to substantially advance robot behaviour for tasks such as object-goal navigation (ObjectNav), where the robot must locate objects specified in natural language by exploring the environment. Current ObjectNav methods heavily depend on prompt engineering for perception and do not address the semantic uncertainty induced by variations in prompt phrasing. Ignoring semantic uncertainty can lead to suboptimal exploration, which in turn limits performance. Hence, we propose a semantic uncertainty-informed active perception pipeline for ObjectNav in indoor environments. We introduce a novel probabilistic sensor model for quantifying semantic uncertainty in vision-language models and incorporate…
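To make the idea of prompt-induced semantic uncertainty concrete, here is a minimal sketch. It is not the paper's sensor model, whose details are not given in this excerpt. It assumes that a vision-language model is queried with several paraphrases of the same goal prompt, that the spread of the resulting similarity scores is treated as observation variance, and that a map cell is updated by inverse-variance (Kalman-style) Gaussian fusion. The function names `prompt_ensemble_stats` and `fuse_gaussian` and all score values are hypothetical.

```python
from statistics import mean, pvariance


def prompt_ensemble_stats(scores, min_var=1e-6):
    """Mean and variance of similarity scores over prompt paraphrases.

    The variance is floored at min_var so later divisions stay well defined.
    """
    if not scores:
        raise ValueError("need at least one score")
    return mean(scores), max(pvariance(scores), min_var)


def fuse_gaussian(prior_mean, prior_var, obs_mean, obs_var):
    """Inverse-variance (Kalman-style) fusion of a map cell with one observation."""
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (obs_mean - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


# Hypothetical scores for one image region under three paraphrased prompts.
consistent = [0.62, 0.60, 0.61]    # paraphrases agree: low semantic uncertainty
inconsistent = [0.90, 0.30, 0.63]  # paraphrases disagree: high semantic uncertainty

prior = (0.5, 0.05)  # assumed prior mean and variance of the map cell

post_consistent = fuse_gaussian(*prior, *prompt_ensemble_stats(consistent))
post_inconsistent = fuse_gaussian(*prior, *prompt_ensemble_stats(inconsistent))

print("consistent prompts:  ", post_consistent)
print("inconsistent prompts:", post_inconsistent)
```

Both observations have the same mean score, but the cell moves much closer to the consistent one and keeps a larger variance after the inconsistent one. A planner could use that remaining variance to decide where further observation is worthwhile.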
