See What the Robot Can't See: Learning Cooperative Perception for Visual Navigation
Jan Blumenkamp, Qingbiao Li, Binyu Wang, Zhe Liu, Amanda Prorok

TL;DR
This paper introduces a cooperative perception approach using sensor communication and graph neural networks to enable a robot to navigate to a target without global positioning, demonstrating improved success rates and real-world transferability.
Contribution
The authors propose a novel sensor communication framework with GNN-based feature aggregation for navigation without global maps or calibration, validated in simulation and real-world environments.
Findings
Up to 2.0x improvement in SPL (Success weighted by Path Length) over baselines
Generalizes to unseen environments and sensor layouts
Successfully transfers from simulation to real-world scenarios
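For readers unfamiliar with the metric behind the first finding, the sketch below computes SPL under the assumption that it denotes the standard Success weighted by Path Length used in embodied navigation: each episode contributes its success flag times the ratio of shortest-path length to the larger of the shortest-path and actual path lengths, averaged over episodes. The function name `spl` and the episode values are hypothetical and are not taken from the paper.

```python
def spl(successes, shortest_lengths, path_lengths):
    """Success weighted by Path Length, averaged over episodes.

    successes: 1 if the episode reached the target, else 0
    shortest_lengths: shortest-path length from start to target (must be > 0)
    path_lengths: length of the path the robot actually travelled
    """
    if not (len(successes) == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same number of episodes")
    if not successes:
        raise ValueError("at least one episode is required")

    total = 0.0
    for success, shortest, travelled in zip(successes, shortest_lengths, path_lengths):
        if shortest <= 0:
            raise ValueError("shortest-path lengths must be positive")
        # A failed episode contributes 0; a successful one is discounted
        # by how much longer the travelled path is than the shortest path.
        total += float(success) * shortest / max(travelled, shortest)
    return total / len(successes)


# Made-up episodes for illustration only (not results from the paper).
example = spl(
    successes=[1, 0, 1, 0],
    shortest_lengths=[10.0, 8.0, 12.0, 5.0],
    path_lengths=[20.0, 8.0, 24.0, 9.0],
)
```

Because the ratio is capped at 1, a policy only reaches an SPL of 1.0 if it succeeds in every episode and always follows a shortest path.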
Abstract
We consider the problem of navigating a mobile robot towards a target in an unknown environment that is endowed with visual sensors, where neither the robot nor the sensors have access to global positioning information and only use first-person-view images. In order to overcome the need for positioning, we train the sensors to encode and communicate relevant viewpoint information to the mobile robot, whose objective it is to use this information to navigate to the target along the shortest path. We overcome the challenge of enabling all the sensors (even those that cannot directly see the target) to predict the direction along the shortest path to the target by implementing a neighborhood-based feature aggregation module using a Graph Neural Network (GNN) architecture. In our experiments, we first demonstrate generalizability to previously unseen environments with various sensor…
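The idea of neighborhood-based feature aggregation can be illustrated with a minimal sketch. It is not the authors' architecture: a real GNN applies learned transformations to image-derived features, whereas this toy version uses plain mean aggregation over a hand-written graph. The names `aggregate`, `propagate`, `neighbors`, and `features` are hypothetical. The point it shows is only that repeated aggregation rounds let information from a sensor that sees the target reach sensors that do not.

```python
def aggregate(features, neighbors):
    """One round of mean aggregation over each node's neighborhood (self included)."""
    updated = []
    for node, own_feature in enumerate(features):
        group = [own_feature] + [features[other] for other in neighbors[node]]
        # Average each feature dimension across the node and its neighbors.
        updated.append([sum(values) / len(group) for values in zip(*group)])
    return updated


def propagate(features, neighbors, rounds):
    """Apply several aggregation rounds; each round extends the reach by one hop."""
    for _ in range(rounds):
        features = aggregate(features, neighbors)
    return features


# Toy chain of three sensors: 0 - 1 - 2. Only sensor 0 "sees" the target,
# encoded here as a single feature value of 1.0 (an illustrative assumption).
neighbors = {0: [1], 1: [0, 2], 2: [1]}
features = [[1.0], [0.0], [0.0]]

after_one = propagate(features, neighbors, rounds=1)
after_two = propagate(features, neighbors, rounds=2)
```

After one round, sensor 2 still holds no target information because the signal has only reached its neighbor; after two rounds its feature becomes non-zero. In this toy setting, the number of rounds bounds how many hops away from the target a sensor can be and still receive a useful signal.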
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
Methods: Graph Neural Network · Semi-Pseudo-Label
