Multi-Agent Embodied Visual Semantic Navigation with Scene Prior Knowledge

Xinzhu Liu, Di Guo, Huaping Liu, and Fuchun Sun

arXiv:2109.09531 · cs.AI · September 21, 2021 · 1 citation

Open Access

TL;DR

This paper introduces a multi-agent approach for visual semantic navigation that leverages scene prior knowledge and communication strategies to improve efficiency and accuracy in complex environments.

Contribution

It proposes a hierarchical decision framework enabling multiple agents to collaboratively navigate using scene priors and semantic mapping, addressing limitations of single-agent systems.

Findings

01. Higher accuracy in unseen scenes

02. Improved efficiency over single-agent models

03. Effective collaboration strategies learned

Abstract

In visual semantic navigation, a robot navigates to a target object using egocentric visual observations, given the class label of the target. It is a meaningful task that has inspired a surge of related research. However, most existing models are effective only for single-agent navigation, and a single agent has low efficiency and poor fault tolerance on more complicated tasks. Multi-agent collaboration can improve efficiency and has strong application potential. In this paper, we propose multi-agent visual semantic navigation, in which multiple agents collaborate with one another to find multiple target objects. It is a challenging task that requires agents to learn reasonable collaboration strategies to explore efficiently under limited communication bandwidth. We develop a hierarchical decision framework based on semantic mapping, scene…

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection