Cross-Modal Reinforcement Learning for Navigation with Degraded Depth Measurements

Omkar Sawant; Luca Zanatta; Grzegorz Malczyk; Kostas Alexis

arXiv:2603.22182 · cs.RO · March 24, 2026

TL;DR

This paper introduces a cross-modal reinforcement learning framework that combines depth and grayscale images to enable robust navigation even when depth sensors are degraded by environmental conditions.

Contribution

It proposes a Cross-Modal Wasserstein Autoencoder that learns shared representations, allowing depth information to be inferred from grayscale images during sensor degradation.
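Wasserstein autoencoders generally regularize the latent space by penalizing a divergence between the encoded samples and a chosen prior, and maximum mean discrepancy (MMD) is one common choice for that penalty. The following is a minimal sketch of such a penalty, not the paper's implementation: the RBF kernel, the bandwidth `sigma`, the latent size of 8, and the names `rbf_kernel` and `mmd2` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, sigma=2.0):
    """Pairwise RBF kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))

def mmd2(z, z_prior, sigma=2.0):
    """Biased estimate of the squared MMD between two sample sets."""
    return (
        rbf_kernel(z, z, sigma).mean()
        + rbf_kernel(z_prior, z_prior, sigma).mean()
        - 2.0 * rbf_kernel(z, z_prior, sigma).mean()
    )

rng = np.random.default_rng(0)
prior = rng.standard_normal((256, 8))          # samples from a unit Gaussian prior
matched = rng.standard_normal((256, 8))        # latents that follow the prior
shifted = rng.standard_normal((256, 8)) + 3.0  # latents far from the prior

mmd_matched = mmd2(matched, prior)
mmd_shifted = mmd2(shifted, prior)
```

In this sketch the penalty is small when the latent codes are distributed like the prior and grows when they drift away from it, which is the behavior a latent regularizer of this kind relies on.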

Findings

01. Maintains navigation performance under significant depth sensor degradation
02. Successfully transfers from simulation to real-world environments
03. Outperforms baseline methods in robustness tests

Abstract

This paper presents a cross-modal learning framework that exploits complementary information from depth and grayscale images for robust navigation. We introduce a Cross-Modal Wasserstein Autoencoder that learns shared latent representations by enforcing cross-modal consistency, enabling the system to infer depth-relevant features from grayscale observations when depth measurements are corrupted. The learned representations are integrated with a Reinforcement Learning-based policy for collision-free navigation in unstructured environments when depth sensors experience degradation due to adverse conditions such as poor lighting or reflective surfaces. Simulation and real-world experiments demonstrate that our approach maintains robust performance under significant depth degradation and successfully transfers to real environments.
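One way to picture "shared latent representations" with "cross-modal consistency" is a pair of encoders, one per modality, whose latent codes are pulled together and decoded by a single depth decoder. The sketch below is an illustrative assumption only and does not reproduce the paper's architecture or loss: the linear encoders and decoder, the sizes `D_PIX` and `D_LAT`, the weight `lam`, and the function `cross_modal_loss` are hypothetical, and a full Wasserstein autoencoder would also add a prior-matching penalty on the latents.

```python
import numpy as np

rng = np.random.default_rng(0)
D_PIX, D_LAT = 64, 8  # hypothetical flattened image size and latent size

# Hypothetical linear encoders (one per modality) and a shared depth decoder.
W_depth = rng.standard_normal((D_PIX, D_LAT)) * 0.1
W_gray = rng.standard_normal((D_PIX, D_LAT)) * 0.1
W_dec = rng.standard_normal((D_LAT, D_PIX)) * 0.1

def cross_modal_loss(depth, gray, lam=1.0):
    """Toy objective: depth reconstruction from both latents plus alignment."""
    z_depth = depth @ W_depth  # latent from the depth image
    z_gray = gray @ W_gray     # latent from the grayscale image

    # Both latents are decoded by the same decoder and compared to clean depth.
    self_recon = np.mean((z_depth @ W_dec - depth) ** 2)
    cross_recon = np.mean((z_gray @ W_dec - depth) ** 2)

    # Consistency term: the two modalities should agree in latent space.
    align = np.mean((z_depth - z_gray) ** 2)
    return self_recon + cross_recon + lam * align

# Random stand-ins for a batch of paired, flattened depth and grayscale images.
depth_batch = rng.random((16, D_PIX))
gray_batch = rng.random((16, D_PIX))
loss = cross_modal_loss(depth_batch, gray_batch)
```

Under these assumptions, minimizing the cross-reconstruction and alignment terms is what would let the grayscale latent stand in for the depth latent when the depth image is corrupted; a downstream policy would then consume the latent rather than the raw depth.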

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Robot Manipulation and Learning