A Survey on Improving Human Robot Collaboration through Vision-and-Language Navigation
Nivedan Yakolli, Avinash Gautam, Abhijit Das, Yuankai Qi, Virendra Singh Shekhawat

TL;DR
This survey reviews recent advancements in Vision-and-Language Navigation (VLN) for robotics, highlighting challenges and proposing future directions to improve multi-robot collaboration and human-robot interaction.
Contribution
It provides an extensive review of nearly 200 articles on VLN in robotics and outlines promising research directions for enhancing multi-robot coordination and communication.
Findings
Current models struggle with ambiguity resolution and bidirectional communication.
Decentralized decision-making and dynamic role assignment are crucial for scalable collaboration.
Future systems should support proactive clarification and real-time feedback.
Abstract
Vision-and-Language Navigation (VLN) is a multi-modal, cooperative task requiring agents to interpret human instructions, navigate 3D environments, and communicate effectively under ambiguity. This paper presents a comprehensive review of recent VLN advancements in robotics and outlines promising directions to improve multi-robot coordination. Despite progress, current models struggle with bidirectional communication, ambiguity resolution, and collaborative decision-making in multi-agent systems. We review approximately 200 relevant articles to provide an in-depth understanding of the current landscape. Through this survey, we aim to provide a thorough resource that inspires further research at the intersection of VLN and robotics. We advocate that future VLN systems should support proactive clarification, real-time feedback, and contextual reasoning through advanced natural…
Taxonomy
Topics: Multimodal Machine Learning Applications · Social Robot Interaction and HRI · Advanced Neural Network Applications
