MA-CoNav: A Master-Slave Multi-Agent Framework with Hierarchical Collaboration and Dual-Level Reflection for Long-Horizon Embodied VLN
Ling Luo, Qianqian Bai

TL;DR
MA-CoNav introduces a hierarchical multi-agent framework with dual-level reflection for improved long-horizon vision-language navigation, effectively distributing perception, planning, and memory functions to enhance performance in complex environments.
Contribution
The paper proposes a Master-Slave multi-agent architecture with a dual-stage reflection mechanism, enabling clearer division of labor among agents and dynamic optimization during VLN.
Findings
Outperforms existing VLN methods on real-world indoor datasets
No scene-specific fine-tuning required for the models
Demonstrates significant improvements across multiple evaluation metrics
Abstract
Vision-Language Navigation (VLN) aims to empower robots with the ability to perform long-horizon navigation in unfamiliar environments based on complex linguistic instructions. Its success critically hinges on establishing an efficient closed loop of language understanding, visual perception, and embodied execution. Existing methods often suffer from perceptual distortion and decision drift in complex, long-distance tasks due to the cognitive overload of a single agent. Inspired by distributed cognition theory, this paper proposes MA-CoNav, a Multi-Agent Collaborative Navigation framework. This framework adopts a "Master-Slave" hierarchical agent collaboration architecture, decoupling and distributing the perception, planning, execution, and memory functions required for navigation tasks to specialized agents. Specifically, the Master Agent is responsible for global orchestration,…
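The decoupling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `NavState`, `MasterAgent`, `SLAVES`, and the four stub functions are hypothetical, the slave agents are trivial placeholders rather than vision-language models, and the reflection mechanism is omitted because the excerpt does not describe it. It shows only a master that orchestrates four specialized agents over a shared state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class NavState:
    """Shared state passed between agents (hypothetical structure)."""
    instruction: str
    observation: str = ""
    plan: List[str] = field(default_factory=list)
    last_action: str = ""
    memory: List[Tuple[str, str]] = field(default_factory=list)


def perception_agent(state: NavState) -> None:
    # Placeholder for visual perception: emits a fake scene description.
    state.observation = f"view at step {len(state.memory)}"


def planning_agent(state: NavState) -> None:
    # Placeholder planner: split the instruction into sub-goals once.
    if not state.plan and not state.memory:
        state.plan = [s.strip() for s in state.instruction.split(",") if s.strip()]


def execution_agent(state: NavState) -> None:
    # Placeholder executor: "performs" the next sub-goal in the plan.
    state.last_action = state.plan.pop(0) if state.plan else ""


def memory_agent(state: NavState) -> None:
    # Records what was done and what was seen when doing it.
    if state.last_action:
        state.memory.append((state.last_action, state.observation))


SLAVES: Dict[str, Callable[[NavState], None]] = {
    "perception": perception_agent,
    "planning": planning_agent,
    "execution": execution_agent,
    "memory": memory_agent,
}


class MasterAgent:
    """Orchestrates the slave agents; makes no navigation decisions itself."""

    ORDER = ("perception", "planning", "execution", "memory")

    def __init__(self, slaves: Dict[str, Callable[[NavState], None]]):
        self.slaves = slaves

    def run(self, state: NavState, max_steps: int = 10) -> List[Tuple[str, str]]:
        for _ in range(max_steps):
            for role in self.ORDER:
                self.slaves[role](state)
            if not state.plan:  # all sub-goals consumed
                break
        return state.memory


if __name__ == "__main__":
    s = NavState("go down the hall, turn left, stop at the door")
    for action, seen in MasterAgent(SLAVES).run(s):
        print(action, "|", seen)
```

Keeping each role behind the same `Callable[[NavState], None]` interface means any single agent could be swapped for a model-backed one without touching the master's loop, which is the division of labor the abstract argues for.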
Taxonomy
Topics: Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization · Robotic Path Planning Algorithms
