TL;DR
ProCompNav is a two-stage discriminative questioning framework for natural-language instance navigation. By asking questions that actively narrow the candidate pool, it improves success rates and shortens the responses users have to give.
Contribution
The paper presents a pool-level discriminative questioning approach based on comparative judgment, and reports that it outperforms existing methods in both success rate and efficiency.
Findings
ProCompNav achieves higher success rates than the baselines on CoIN-Bench.
It achieves a state-of-the-art success rate on TextNav.
The method shortens user responses while maintaining high accuracy.
Abstract
Natural-language instance navigation becomes challenging when the initial user request does not uniquely specify the target instance. A practical agent should reduce the user's burden by actively asking only the information needed to distinguish the target from similar distractors, rather than requiring a detailed description upfront. Existing approaches often fall short of this goal: they may stop at the first plausible candidate before sufficiently exploring alternatives, or, even after collecting multiple candidates, ask about the target's attributes derived from individual candidates rather than questions selected to distinguish candidates in the pool. As a result, despite the dialogue, the agent may still fail to distinguish the target from distractors, leading to premature decisions and lengthy user responses. We propose Proactive Instance Navigation with Comparative Judgment…
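The contrast the abstract draws, between asking about one candidate's attributes and asking a question chosen to separate the candidates in the pool, can be illustrated with a toy sketch. This is not the paper's method: the candidate format, the attribute names, and the selection rule (pick the attribute whose most common value covers the fewest candidates) are all assumptions made for illustration.

```python
from collections import Counter

def pick_discriminative_attribute(pool):
    """Return the attribute whose values leave the smallest worst-case group.

    Hypothetical selection rule: an attribute shared by every candidate
    (e.g. all chairs are red) separates nothing, so it scores worst.
    """
    attributes = sorted({key for cand in pool for key in cand["attrs"]})
    best_attr, best_worst = None, len(pool) + 1
    for attr in attributes:
        counts = Counter(cand["attrs"].get(attr) for cand in pool)
        worst = max(counts.values())  # largest group left after any answer
        if worst < best_worst:
            best_attr, best_worst = attr, worst
    return best_attr

def narrow_pool(pool, attr, answer):
    """Keep only the candidates consistent with the user's answer."""
    return [cand for cand in pool if cand["attrs"].get(attr) == answer]

# Made-up candidate pool: three chairs that all match "the red chair".
pool = [
    {"id": "chair_1", "attrs": {"color": "red", "material": "wood", "room": "kitchen"}},
    {"id": "chair_2", "attrs": {"color": "red", "material": "metal", "room": "kitchen"}},
    {"id": "chair_3", "attrs": {"color": "red", "material": "plastic", "room": "office"}},
]

attr = pick_discriminative_attribute(pool)
remaining = narrow_pool(pool, attr, "metal")
print(attr, [cand["id"] for cand in remaining])
```

In this toy pool, asking about color would not help because every candidate is red, whereas a single short answer about material isolates one candidate. That is the intuition behind selecting questions at the pool level rather than per candidate.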
