See What the Robot Can't See: Learning Cooperative Perception for Visual Navigation

Jan Blumenkamp; Qingbiao Li; Binyu Wang; Zhe Liu; Amanda Prorok

arXiv:2208.00759 · cs.RO · August 1, 2023

TL;DR

This paper introduces a cooperative perception approach using sensor communication and graph neural networks to enable a robot to navigate to a target without global positioning, demonstrating improved success rates and real-world transferability.

Contribution

The authors propose a novel sensor communication framework with GNN-based feature aggregation for navigation without global maps or calibration, validated in simulation and real-world environments.

Findings

01 Up to 2.0x improvement in SPL (Success weighted by Path Length) over baselines

02 Generalizes to unseen environments and sensor layouts

03 Successfully transfers from simulation to real-world scenarios
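
For readers unfamiliar with the SPL metric cited in the findings, the sketch below computes it using the standard definition: each successful episode scores the ratio of the shortest path length to the longer of the path actually taken and the shortest path, failed episodes score zero, and the scores are averaged. The function name `spl` and the three example episodes are illustrative assumptions, not data or code from the paper.

```python
from typing import Sequence


def spl(successes: Sequence[bool],
        shortest: Sequence[float],
        taken: Sequence[float]) -> float:
    """Success weighted by Path Length, averaged over episodes.

    successes[i]: whether episode i reached the target
    shortest[i]:  shortest-path length from start to target (assumed > 0)
    taken[i]:     length of the path the robot actually travelled
    """
    if not (len(successes) == len(shortest) == len(taken)):
        raise ValueError("all inputs need one entry per episode")
    if not successes:
        raise ValueError("need at least one episode")

    total = 0.0
    for ok, l, p in zip(successes, shortest, taken):
        if l <= 0:
            raise ValueError("shortest-path lengths must be positive")
        if ok:
            # A perfect path scores 1.0; detours shrink the score.
            total += l / max(p, l)
    return total / len(successes)


# Hypothetical episodes: optimal success, success with a 2x detour, failure.
example = spl([True, True, False], [10.0, 10.0, 10.0], [10.0, 20.0, 15.0])
print(example)
```

Because SPL penalizes detours as well as failures, a 2.0x improvement can come from reaching the target more often, from taking shorter paths, or from both.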

Abstract

We consider the problem of navigating a mobile robot towards a target in an unknown environment that is endowed with visual sensors, where neither the robot nor the sensors have access to global positioning information and only use first-person-view images. In order to overcome the need for positioning, we train the sensors to encode and communicate relevant viewpoint information to the mobile robot, whose objective it is to use this information to navigate to the target along the shortest path. We overcome the challenge of enabling all the sensors (even those that cannot directly see the target) to predict the direction along the shortest path to the target by implementing a neighborhood-based feature aggregation module using a Graph Neural Network (GNN) architecture. In our experiments, we first demonstrate generalizability to previously unseen environments with various sensor…
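
The neighborhood-based aggregation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's architecture: it has no learned weights or image encoders and uses plain mean aggregation over a hypothetical four-sensor chain, where only sensor 0 holds a target-related feature. The names `aggregate`, `neighbors`, and `features` are assumptions made for the example. It shows only why repeated message passing lets sensors that cannot see the target receive information about it.

```python
from typing import Dict, List

Features = Dict[int, List[float]]


def aggregate(features: Features, neighbors: Dict[int, List[int]]) -> Features:
    """One round of mean aggregation over each node's neighborhood (self included)."""
    out: Features = {}
    for node, feat in features.items():
        group = [feat] + [features[n] for n in neighbors.get(node, [])]
        # Average each feature dimension across the node and its neighbors.
        out[node] = [sum(col) / len(group) for col in zip(*group)]
    return out


# Hypothetical sensor chain 0 - 1 - 2 - 3; only sensor 0 "sees" the target.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = {0: [1.0], 1: [0.0], 2: [0.0], 3: [0.0]}

# Count the rounds needed before the farthest sensor receives any signal.
rounds = 0
state = features
while state[3][0] == 0.0:
    state = aggregate(state, neighbors)
    rounds += 1

print(rounds, state)
```

In this toy graph the signal needs one aggregation round per hop, so sensor 3 first receives a nonzero value after three rounds. A learned GNN replaces the fixed mean with trainable functions, but the hop-by-hop propagation is the same.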

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection

Methods: Graph Neural Network · Semi-Pseudo-Label