Depth Transfer: Learning to See Like a Simulator for Real-World Drone Navigation
Hang Yu, Christophe De Wagter, Guido C. H. E. de Croon

TL;DR
This paper introduces a depth transfer method using domain adaptation and VAEs to improve sim-to-real transfer for drone navigation, significantly enhancing obstacle avoidance and enabling robust real-world deployment.
Contribution
A novel depth transfer approach that aligns simulated and real stereo depth data, facilitating direct policy transfer without fine-tuning in drone navigation tasks.
Findings
Nearly doubles the obstacle avoidance success rate in simulation.
Outperforms baselines in a photo-realistic simulator.
Demonstrates effective real-world navigation in indoor and outdoor environments.
Abstract
Sim-to-real transfer is a fundamental challenge in robot reinforcement learning. Discrepancies between simulation and reality can significantly impair policy performance, especially when the policy receives high-dimensional inputs such as dense depth estimates from vision. We propose a novel depth transfer method based on domain adaptation to bridge the visual gap between simulated and real-world depth data. A Variational Autoencoder (VAE) is first trained to encode ground-truth depth images from simulation into a latent space, which serves as input to a reinforcement learning (RL) policy. During deployment, the encoder is refined to align stereo depth images with this latent space, enabling direct policy transfer without fine-tuning. We apply our method to the task of autonomous drone navigation through cluttered environments. Experiments in IsaacGym show that our method nearly doubles the…
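The latent-alignment idea in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' implementation: the abstract does not specify the architecture, the loss, or the training data, so everything here is an assumption. The sketch assumes a linear encoder standing in for the VAE encoder mean, paired clean and stereo-like depth samples, a mean-squared-error alignment loss, and a hypothetical noise-plus-dropout model of stereo artifacts. All function names are made up for illustration.

```python
import numpy as np


def make_depth_batch(rng, n, dim):
    """Hypothetical stand-in for ground-truth simulator depth (flattened, normalized to [0, 1])."""
    return rng.uniform(0.05, 1.0, size=(n, dim))


def corrupt_like_stereo(rng, depth, noise_std=0.03, dropout=0.1):
    """Assumed stereo-like corruption: additive noise plus invalid (zeroed) pixels."""
    noisy = depth + rng.normal(0.0, noise_std, size=depth.shape)
    invalid = rng.random(depth.shape) < dropout
    noisy[invalid] = 0.0
    return noisy


def encode(weights, depth):
    """Linear encoder mean; a real VAE encoder would be a CNN that also outputs a variance."""
    return depth @ weights


def latent_mse(weights, inputs, target):
    """Mean squared distance between encoded inputs and the target latents."""
    return float(np.mean((encode(weights, inputs) - target) ** 2))


def align_encoder(teacher_w, clean, noisy, lr=0.1, steps=500):
    """Refine a copy of the encoder so noisy depth maps to the frozen clean-depth latents."""
    target = encode(teacher_w, clean)   # frozen latents the policy was trained on
    student_w = teacher_w.copy()        # start from the simulation-trained encoder
    for _ in range(steps):
        err = encode(student_w, noisy) - target
        grad = 2.0 * noisy.T @ err / len(noisy)  # gradient of the per-sample squared error
        student_w -= lr * grad
    return student_w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = make_depth_batch(rng, 256, 16)
    noisy = corrupt_like_stereo(rng, clean)
    teacher_w = rng.normal(0.0, 0.25, size=(16, 4))
    student_w = align_encoder(teacher_w, clean, noisy)
    target = encode(teacher_w, clean)
    print("latent MSE before:", latent_mse(teacher_w, noisy, target))
    print("latent MSE after: ", latent_mse(student_w, noisy, target))
```

Because only the encoder is refined and the target latents stay fixed, a policy that consumes those latents would not need retraining in this toy setting, which mirrors the "direct policy transfer without fine-tuning" claim at a very small scale.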
Taxonomy
Topics: Advanced Vision and Imaging · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications
