IROS: A Dual-Process Architecture for Real-Time VLM-Based Indoor Navigation
Joonhee Lee, Hyunseung Shin, Jeonggil Ko

TL;DR
IROS is a real-time indoor navigation system that combines fast reflexive responses with deliberate reasoning using lightweight vision-language models, enabling robust and efficient robot navigation in complex environments.
Contribution
The paper introduces a dual-process architecture for indoor navigation that integrates VLMs with perceptual modules, reducing latency and improving decision accuracy on embedded hardware.
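The dual-process idea can be sketched as a per-frame dispatcher: a cheap reflexive path handles most frames, and the slower VLM path is called only when a frame needs semantic reasoning. This is a minimal illustration under stated assumptions, not the authors' implementation. The `Observation` fields, the trigger rule in `needs_deliberation`, the action strings, and the `slow_reasoner` callback are all hypothetical names introduced here; this summary does not specify how the paper decides when to invoke the VLM.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Observation:
    """Per-frame output of lightweight perception (hypothetical fields)."""
    obstacle_ahead: bool
    at_intersection: bool
    visible_text: Optional[str] = None  # e.g. a sign or room number


def needs_deliberation(obs: Observation) -> bool:
    # Assumed trigger rule: escalate at intersections or when text is visible.
    return obs.at_intersection or obs.visible_text is not None


def step(obs: Observation, slow_reasoner: Callable[[Observation], str]) -> str:
    """Return one action string for the current frame."""
    # Fast path: a safety reflex that never waits on the slow model.
    if obs.obstacle_ahead:
        return "stop"
    # Slow path: call the (stand-in) VLM only when the frame warrants it.
    if needs_deliberation(obs):
        return slow_reasoner(obs)
    # Default fast path: keep moving without invoking the slow model.
    return "forward"
```

Calling `step(obs, my_vlm_wrapper)` once per camera frame, where `my_vlm_wrapper` is whatever function wraps the deployed model, keeps the slow model out of the loop on frames that the reflexive path can handle alone. That gating is the mechanism by which a design of this kind can cut latency relative to invoking the VLM on every frame.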
Findings
Reduces navigation latency by 66% compared to continuous VLM use.
Enhances decision accuracy in real-world indoor environments.
Operates efficiently on low-cost, on-device hardware.
Abstract
Indoor mobile robot navigation requires fast responsiveness and robust semantic understanding, yet existing methods struggle to provide both. Classical geometric approaches such as SLAM offer reliable localization but depend on detailed maps and cannot interpret human-targeted cues (e.g., signs, room numbers) essential for indoor reasoning. Vision-Language-Action (VLA) models introduce semantic grounding but remain strictly reactive, basing decisions only on visible frames and failing to anticipate unseen intersections or reason about distant textual cues. Vision-Language Models (VLMs) provide richer contextual inference but suffer from high computational latency, making them unsuitable for real-time operation on embedded platforms. In this work, we present IROS, a real-time navigation framework that combines VLM-level contextual reasoning with the efficiency of lightweight perceptual…
Taxonomy
Topics: Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization · Advanced Neural Network Applications
