Visual Depth Mapping from Monocular Images using Recurrent Convolutional Neural Networks
John Mern, Kyle Julian, Rachael E. Tompa, Mykel J. Kochenderfer

TL;DR
This paper introduces a deep recurrent convolutional neural network that estimates depth maps from monocular video sequences, enabling low-cost collision avoidance for small unmanned aircraft.
Contribution
It presents a novel neural network architecture trained on simulated data to generate accurate depth maps for sense-and-avoid systems using monocular cameras.
Findings
Achieves superior depth estimation performance over prior methods
Successfully demonstrates obstacle avoidance in simulation
Uses simulated data for training, reducing real-world data requirements
Abstract
A reliable sense-and-avoid system is critical to enabling safe autonomous operation of unmanned aircraft. Existing sense-and-avoid methods often require specialized sensors that are too large or power-intensive for use on small unmanned vehicles. This paper presents a method to estimate object distances based on visual image sequences, allowing for the use of low-cost, on-board monocular cameras as simple collision avoidance sensors. We present a deep recurrent convolutional neural network and training method to generate depth maps from video sequences. Our network is trained using simulated camera and depth data generated with Microsoft's AirSim simulator. Empirically, we show that our model achieves superior performance compared to models generated using prior methods. We further demonstrate that the method can be used for sense-and-avoid of obstacles in simulation.
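To make the sense-and-avoid use of a depth map concrete, here is a minimal sketch of how a predicted depth map could feed a simple avoidance check. It is not the authors' method: the abstract does not describe the avoidance logic, so the central-window rule, the function names (`central_window`, `nearest_obstacle`, `should_avoid`), and the 10 m threshold are all assumptions made for illustration. The depth map is taken to be a plain 2D grid of distances in metres, such as a network might output for one frame.

```python
from typing import List, Sequence

# Assumed representation: rows of per-pixel distances in metres.
DepthMap = Sequence[Sequence[float]]


def central_window(depth_map: DepthMap, fraction: float = 0.5) -> List[List[float]]:
    """Return the centred sub-window covering `fraction` of each image dimension."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rows, cols = len(depth_map), len(depth_map[0])
    win_h = max(1, round(rows * fraction))
    win_w = max(1, round(cols * fraction))
    top = (rows - win_h) // 2
    left = (cols - win_w) // 2
    return [list(row[left:left + win_w]) for row in depth_map[top:top + win_h]]


def nearest_obstacle(depth_map: DepthMap, fraction: float = 0.5) -> float:
    """Smallest predicted distance inside the central window (hypothetical rule)."""
    window = central_window(depth_map, fraction)
    return min(min(row) for row in window)


def should_avoid(depth_map: DepthMap, threshold_m: float = 10.0,
                 fraction: float = 0.5) -> bool:
    """Flag an avoidance manoeuvre when something ahead is closer than the threshold."""
    return nearest_obstacle(depth_map, fraction) < threshold_m


if __name__ == "__main__":
    # Toy 4x4 depth map: open space at 50 m with one close pixel near the centre.
    frame = [[50.0] * 4 for _ in range(4)]
    frame[1][2] = 5.0
    print(nearest_obstacle(frame), should_avoid(frame))
```

Restricting the check to a central window is one way to ignore obstacles far off the flight path; a real system would more likely account for the camera's field of view, the vehicle's velocity, and uncertainty in the predicted depths.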
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Robotic Path Planning Algorithms
