Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification

Jiawen Wen; Penglei Sun; Wenjie Zhang; Suixuan Qiu; Weisheng Xu; Xiaofei Yang; Xiaowen Chu

arXiv:2604.16993 · cs.AI · April 21, 2026

TL;DR

Rule-VLN introduces a large-scale urban benchmark for rule-compliant navigation and proposes SNRM, a module that enhances safety awareness in pre-trained embodied AI agents, addressing semantic and regulatory constraints.

Contribution

The paper presents the first large-scale urban benchmark for rule-compliant navigation and a universal zero-shot safety module, SNRM, to improve regulatory adherence in embodied AI.

Findings

01. Rule-VLN challenges current models with diverse regulatory constraints.

02. SNRM reduces the goal violation rate by 19.26%.

03. SNRM improves the navigation success rate by 5.97%.

Abstract

As embodied AI transitions to real-world deployment, the measure of success in Vision-and-Language Navigation (VLN) is shifting from mere reachability to social compliance. However, current agents suffer from a "goal-driven trap": they prioritize physical geometry ("can I go?") over semantic rules ("may I go?") and frequently overlook subtle regulatory constraints. To bridge this gap, we establish Rule-VLN, the first large-scale urban benchmark for rule-compliant navigation. Spanning a 29k-node environment, it injects 177 diverse regulatory categories into 8k constrained nodes across four curriculum levels, challenging agents with fine-grained visual and behavioral constraints. We further propose the Semantic Navigation Rectification Module (SNRM), a universal, zero-shot module designed to equip pre-trained agents with safety awareness. SNRM integrates a coarse-to-fine visual…
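The abstract's distinction between "can I go?" and "may I go?" can be illustrated with a toy navigation graph. The sketch below is not the paper's SNRM or the Rule-VLN data format; the `Node` fields, the `reachable` and `permitted` helpers, and the `no_entry` rule label are all hypothetical names chosen for illustration. It only shows how a rule check might narrow a set of geometrically valid moves.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    # Hypothetical node record; field names are illustrative, not from the benchmark.
    node_id: str
    neighbors: List[str] = field(default_factory=list)  # geometrically connected nodes
    rule: Optional[str] = None  # assumed rule label, e.g. "no_entry"; None = unconstrained


def reachable(graph: Dict[str, Node], current: str) -> List[str]:
    """'Can I go?': every neighbor connected to the current node."""
    return list(graph[current].neighbors)


def permitted(graph: Dict[str, Node], current: str, forbidden: Set[str]) -> List[str]:
    """'May I go?': reachable neighbors whose rule label is not forbidden."""
    return [n for n in reachable(graph, current) if graph[n].rule not in forbidden]


# Toy graph: node "b" is physically reachable from "a" but carries a rule.
graph = {
    "a": Node("a", ["b", "c"]),
    "b": Node("b", ["a"], rule="no_entry"),
    "c": Node("c", ["a"]),
}
forbidden = {"no_entry"}

print(reachable(graph, "a"))             # geometry only
print(permitted(graph, "a", forbidden))  # geometry plus rule check
```

An agent that plans only over `reachable` would treat "b" as a valid step; filtering through `permitted` removes it before planning. In the benchmark the rule would have to be perceived from visual input rather than read from a label, which is the harder part this sketch leaves out.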
