BEACON: Language-Conditioned Navigation Affordance Prediction under Occlusion

Xinyu Gao; Gang Chen; Javier Alonso-Mora

arXiv:2603.09961 · cs.RO · March 11, 2026

TL;DR

BEACON lets a robot predict navigable target locations, including those in occluded areas, as a bird's-eye view (BEV) heatmap conditioned on a language instruction, extending spatial reasoning beyond visible regions.

Contribution

This work introduces a novel BEV affordance prediction method that incorporates spatial cues into vision-language models to handle occlusions in language-conditioned navigation.

Findings

1. 22.74 percentage-point improvement over the state of the art
2. Effective BEV formulation for occlusion reasoning
3. Validated on an occlusion-rich dataset in the Habitat simulator

Abstract

Language-conditioned local navigation requires a robot to infer a nearby traversable target location from its current observation and an open-vocabulary, relational instruction. Existing vision-language spatial grounding methods usually rely on vision-language models (VLMs) to reason in image space, producing 2D predictions tied to visible pixels. As a result, they struggle to infer target locations in occluded regions, typically caused by furniture or moving humans. To address this issue, we propose BEACON, which predicts an ego-centric Bird's-Eye View (BEV) affordance heatmap over a bounded local region including occluded areas. Given an instruction and surround-view RGB-D observations from four directions around the robot, BEACON predicts the BEV heatmap by injecting spatial cues into a VLM and fusing the VLM's output with depth-derived BEV features. Using an occlusion-aware dataset…
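To make the output representation concrete, here is a minimal sketch of how a navigation target could be read from an ego-centric BEV affordance heatmap. It is not the authors' implementation. The function name `heatmap_peak_to_ego_xy`, the square region of side `extent_m` centered on the robot, the grid layout (row 0 farthest forward, column 0 farthest left), and the choice of the peak cell as the target are all assumptions made for illustration.

```python
from typing import List, Tuple


def heatmap_peak_to_ego_xy(
    heatmap: List[List[float]], extent_m: float = 6.0
) -> Tuple[float, float]:
    """Return (x_forward, y_left) in meters for the highest-scoring BEV cell.

    Assumed layout (hypothetical, not taken from the paper):
    - the heatmap covers a square of side `extent_m` centered on the robot;
    - row 0 is the farthest-forward strip, column 0 the farthest-left strip.
    """
    rows = len(heatmap)
    cols = len(heatmap[0])

    # Arg-max over all cells; ties resolve to the first cell in row-major order.
    best_r, best_c = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: heatmap[rc[0]][rc[1]],
    )

    cell_h = extent_m / rows  # cell size along the forward axis
    cell_w = extent_m / cols  # cell size along the lateral axis

    # Convert the cell center to metric offsets from the robot at the grid center.
    x_forward = extent_m / 2 - (best_r + 0.5) * cell_h
    y_left = extent_m / 2 - (best_c + 0.5) * cell_w
    return x_forward, y_left


if __name__ == "__main__":
    # Toy 4x4 heatmap over a 4 m x 4 m region with one high-affordance cell.
    toy = [[0.0] * 4 for _ in range(4)]
    toy[0][3] = 0.9  # front-right corner cell
    print(heatmap_peak_to_ego_xy(toy, extent_m=4.0))
```

Because the grid is defined in metric space around the robot rather than in image space, the peak cell can lie behind furniture or a person, which is the property the abstract contrasts with pixel-tied 2D predictions.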

Taxonomy

Topics: Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization · Advanced Neural Network Applications