Enhancing Multi-Robot Semantic Navigation Through Multimodal Chain-of-Thought Score Collaboration
Zhixuan Shen, Haonan Luo, Kexun Chen, Fengmao Lv, Tianrui Li

TL;DR
This paper introduces MCoCoNav, a modular multimodal Chain-of-Thought approach that enhances multi-robot semantic navigation by improving exploration efficiency and reducing communication costs through probabilistic scoring and global semantic maps.
Contribution
It presents a novel multimodal Chain-of-Thought framework for multi-robot navigation that integrates visual perception, language models, and semantic mapping to improve exploration and communication efficiency.
Findings
Demonstrates improved exploration efficiency on HM3D and MP3D datasets.
Reduces communication overhead with a global semantic map.
Achieves stable navigation outputs with probabilistic scoring.
Abstract
Understanding how humans cooperatively utilize semantic knowledge to explore unfamiliar environments and decide on navigation directions is critical for house service multi-robot systems. Previous methods primarily focused on single-robot centralized planning strategies, which severely limited exploration efficiency. Recent research has considered decentralized planning strategies for multiple robots, assigning separate planning models to each robot, but these approaches often overlook communication costs. In this work, we propose Multimodal Chain-of-Thought Co-Navigation (MCoCoNav), a modular approach that utilizes multimodal Chain-of-Thought to plan collaborative semantic navigation for multiple robots. MCoCoNav combines visual perception with Vision Language Models (VLMs) to evaluate exploration value through probabilistic scoring, thus reducing time costs and achieving stable…
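The abstract mentions evaluating exploration value through probabilistic scoring but gives no detail on this page. As a rough illustration only, and not the authors' implementation, the sketch below assumes a VLM returns logits over discrete score levels 1 to 5 for each candidate frontier; the expected score is then used to rank frontiers. The function names, the five-level scale, and the example logits are all hypothetical.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_score(logits, levels=(1, 2, 3, 4, 5)):
    """Expected score under the distribution implied by the logits.

    Assumption: the model exposes one logit per discrete score level.
    """
    probs = softmax(logits)
    return sum(p * s for p, s in zip(probs, levels))

def pick_frontier(frontier_logits):
    """Return the frontier with the highest expected score, plus all scores."""
    scores = {name: expected_score(l) for name, l in frontier_logits.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical logits for two candidate frontiers.
frontier_logits = {
    "hallway": [0.0, 0.0, 0.0, 0.0, 0.0],  # uniform: no preference
    "kitchen": [0.0, 0.0, 0.0, 2.0, 4.0],  # mass concentrated on high scores
}
best, scores = pick_frontier(frontier_logits)
```

Taking an expectation over score levels, rather than a single sampled score, is one way such a scheme could yield more stable outputs across repeated queries; whether MCoCoNav does exactly this is not stated here.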
Taxonomy
Topics: Semantic Web and Ontologies · Robotics and Automated Systems · Speech and dialogue systems
