CLUE: Adaptively Prioritized Contextual Cues by Leveraging a Unified Semantic Map for Effective Zero-Shot Object-Goal Navigation
Taeyun Kim, Alvin Jinsung Choi, Dasol Hong, Hyun Myung

TL;DR
CLUE is a novel zero-shot object-goal navigation framework that adaptively balances room and object cues using LLM-derived commonsense knowledge, improving navigation success and efficiency.
Contribution
The paper introduces CLUE, which leverages an offline large language model to adaptively prioritize contextual cues, constructing a unified semantic map for improved navigation.
Findings
CLUE outperforms state-of-the-art methods in both success rate and SPL (Success weighted by Path Length).
Effectively balances room and object cues based on target ambiguity.
Demonstrates robustness in both simulation and real-world tests.
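For readers unfamiliar with the SPL metric named in the findings, the sketch below computes it following the standard definition commonly used in embodied navigation benchmarks: each successful episode contributes the ratio of the shortest-path length to the longer of the path actually taken and the shortest path, and failures contribute zero. The function name `spl` and the tuple layout are illustrative choices, not taken from the paper, and the numbers in the usage note are made up.

```python
def spl(episodes):
    """Success weighted by Path Length.

    episodes: iterable of (success, shortest_len, taken_len) tuples, where
    shortest_len is the geodesic start-to-goal distance (assumed > 0) and
    taken_len is the length of the path the agent actually travelled.
    """
    episodes = list(episodes)
    if not episodes:
        raise ValueError("need at least one episode")
    total = 0.0
    for success, shortest_len, taken_len in episodes:
        if success:
            # Ratio is 1.0 for an optimal path and shrinks as the agent detours.
            total += shortest_len / max(taken_len, shortest_len)
    # Failed episodes add nothing but still count in the denominator.
    return total / len(episodes)
```

As a toy check, one success that took twice the shortest path plus one failure gives `spl([(True, 5.0, 10.0), (False, 4.0, 8.0)]) == 0.25`, so SPL penalizes both failures and inefficient routes.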
Abstract
Zero-shot object-goal navigation (ZSON) is a challenging problem in robotics that requires a comprehensive understanding of both language and visual observations. Contextual cues from rooms and objects are critical, but their relative importance depends on the target: some objects are strongly tied to specific room types, while others are better predicted by nearby co-located objects. Existing methods overlook this distinction, leading to inefficient and inaccurate exploration. We present CLUE, a novel navigation framework that adaptively balances the use of contextual rooms and objects by leveraging commonsense knowledge extracted from an offline large language model (LLM). By estimating a target's association with room types using the LLM, the agent prioritizes room cues for predictable objects and object cues for those with weak room associations. Our framework constructs a unified…
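The abstract does not say how the balance between room and object cues is computed, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes the LLM yields a score per room type for the target, turns the sharpness of that distribution (one minus normalized entropy) into a room-cue weight, and blends two hypothetical per-frontier scores with it. The names `room_cue_weight` and `frontier_score`, the entropy-based weighting, and all numbers are assumptions made for illustration.

```python
import math


def room_cue_weight(room_scores):
    """Map hypothetical LLM room-association scores to a weight in [0, 1].

    Returns about 1.0 when the target is tied to a single room type and
    about 0.0 when it is equally likely in every room. This entropy-based
    rule is an assumption, not the weighting used in the paper.
    """
    total = sum(room_scores.values())
    if total <= 0:
        raise ValueError("room scores must have a positive sum")
    if len(room_scores) < 2:
        return 1.0
    probs = [s / total for s in room_scores.values() if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    # Normalize by the maximum possible entropy over all listed room types.
    return 1.0 - entropy / math.log(len(room_scores))


def frontier_score(room_score, object_score, weight):
    """Blend room-based and object-based evidence for one candidate frontier."""
    return weight * room_score + (1.0 - weight) * object_score


# Hypothetical example: a target strongly associated with one room type.
w_predictable = room_cue_weight({"bathroom": 0.94, "bedroom": 0.03, "kitchen": 0.03})
# Hypothetical example: a target with no clear room association.
w_ambiguous = room_cue_weight({"bathroom": 0.3, "bedroom": 0.4, "kitchen": 0.3})
```

Under these assumptions `w_predictable` is larger than `w_ambiguous`, so room evidence dominates the frontier score for the first target and co-located object evidence dominates for the second, mirroring the behaviour the abstract describes.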
