Multi-Agent Embodied Visual Semantic Navigation with Scene Prior Knowledge
Xinzhu Liu, Di Guo, Huaping Liu, and Fuchun Sun

TL;DR
This paper introduces a multi-agent approach for visual semantic navigation that leverages scene prior knowledge and communication strategies to improve efficiency and accuracy in complex environments.
Contribution
It proposes a hierarchical decision framework enabling multiple agents to collaboratively navigate using scene priors and semantic mapping, addressing limitations of single-agent systems.
Findings
Higher accuracy in unseen scenes
Improved efficiency over single-agent models
Effective collaboration strategies learned
Abstract
In visual semantic navigation, a robot navigates to a target object using egocentric visual observations, given the class label of the target. It is a meaningful task that has inspired a surge of related research. However, most existing models are effective only for single-agent navigation, and a single agent has low efficiency and poor fault tolerance when completing more complicated tasks. Multi-agent collaboration can improve efficiency and has strong application potential. In this paper, we propose multi-agent visual semantic navigation, in which multiple agents collaborate to find multiple target objects. It is a challenging task that requires agents to learn reasonable collaboration strategies to explore efficiently under communication bandwidth constraints. We develop a hierarchical decision framework based on semantic mapping, scene…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
