Cross-Modal Reinforcement Learning for Navigation with Degraded Depth Measurements
Omkar Sawant, Luca Zanatta, Grzegorz Malczyk, Kostas Alexis

TL;DR
This paper introduces a cross-modal reinforcement learning framework that combines depth and grayscale images to enable robust navigation even when depth sensors are degraded by environmental conditions.
Contribution
The paper proposes a Cross-Modal Wasserstein Autoencoder that learns shared representations, so that depth-relevant information can be inferred from grayscale images when the depth sensor is degraded.
Findings
Maintains navigation performance under significant depth sensor degradation
Successfully transfers from simulation to real-world environments
Outperforms baseline methods in robustness tests
Abstract
This paper presents a cross-modal learning framework that exploits complementary information from depth and grayscale images for robust navigation. We introduce a Cross-Modal Wasserstein Autoencoder that learns shared latent representations by enforcing cross-modal consistency, enabling the system to infer depth-relevant features from grayscale observations when depth measurements are corrupted. The learned representations are integrated with a Reinforcement Learning-based policy for collision-free navigation in unstructured environments where depth sensors degrade under adverse conditions such as poor lighting or reflective surfaces. Simulation and real-world experiments demonstrate that our approach maintains robust performance under significant depth degradation and successfully transfers to real environments.
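To make the idea of a shared latent space with cross-modal consistency more concrete, here is a minimal sketch of one way such an objective could be written. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the network architecture, the latent size, the loss weighting, or which divergence is used for the Wasserstein Autoencoder penalty. The sketch assumes linear encoders and a linear depth decoder, a standard normal prior, and a maximum mean discrepancy (MMD) penalty with an RBF kernel, which is one common choice for Wasserstein Autoencoders. All names (`rbf_mmd2`, `init_params`, `cross_modal_wae_loss`, `lam`, `sigma`) are hypothetical.

```python
import numpy as np


def rbf_mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between samples x and y (RBF kernel)."""
    def kernel(a, b):
        # Pairwise squared Euclidean distances, clipped against round-off.
        sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


def init_params(rng, input_dim, latent_dim):
    """Random linear maps standing in for real encoder/decoder networks (assumption)."""
    scale = 1.0 / np.sqrt(input_dim)
    return {
        "enc_depth": scale * rng.standard_normal((input_dim, latent_dim)),
        "enc_gray": scale * rng.standard_normal((input_dim, latent_dim)),
        "dec_depth": scale * rng.standard_normal((latent_dim, input_dim)),
    }


def cross_modal_wae_loss(depth, gray, params, rng, lam=1.0, sigma=1.0):
    """Reconstruction + latent-prior penalty for paired, flattened depth/grayscale batches."""
    z_depth = depth @ params["enc_depth"]   # latent code from the depth image
    z_gray = gray @ params["enc_gray"]      # latent code from the grayscale image

    # Both latents are decoded to depth. The grayscale -> depth path is the
    # cross-modal term: it pushes the grayscale encoder toward depth-relevant features.
    rec_from_depth = z_depth @ params["dec_depth"]
    rec_from_gray = z_gray @ params["dec_depth"]
    recon = np.mean((rec_from_depth - depth) ** 2) + np.mean((rec_from_gray - depth) ** 2)

    # WAE-style penalty: match each aggregate latent distribution to the prior.
    prior = rng.standard_normal(z_depth.shape)
    penalty = rbf_mmd2(z_depth, prior, sigma) + rbf_mmd2(z_gray, prior, sigma)
    return recon + lam * penalty


# Usage on random stand-in data (16 paired images flattened to 32 values each).
rng = np.random.default_rng(0)
depth_batch = rng.standard_normal((16, 32))
gray_batch = rng.standard_normal((16, 32))
params = init_params(rng, input_dim=32, latent_dim=8)
print(cross_modal_wae_loss(depth_batch, gray_batch, params, rng))
```

In a setup like this, a downstream navigation policy would consume the latent code rather than raw pixels, so a policy trained on depth-derived latents could fall back on the grayscale encoder when depth is corrupted. How the paper actually wires the latent into the policy is not described in the abstract.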
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Robot Manipulation and Learning
