Real-time dense 3D Reconstruction from monocular video data captured by low-cost UAVs
Max Hermann, Boitumelo Ruf, Martin Weinmann

TL;DR
This paper presents a method for real-time dense 3D reconstruction from monocular video captured by low-cost UAVs. It maps the environment quickly without an explicit depth sensor, which suits applications such as navigation and live evaluation of an emergency.
Contribution
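The paper introduces a real-time 3D reconstruction pipeline for UAV-based mapping that needs only a monocular video stream and the camera's intrinsic calibration. It chains a SLAM stage for the rough camera trajectory, a Multi-View Stereo (MVS) stage for depth estimation, and a final stage that fuses the estimated depth.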
Findings
Achieves real-time performance at 30 fps for a resolution of 768x448 pixels.
Produces 3D reconstructions that are competitive both qualitatively and quantitatively.
Effectively estimates camera trajectory and depth without explicit depth sensors.
Abstract
Real-time 3D reconstruction enables fast dense mapping of the environment, which benefits numerous applications such as navigation or live evaluation of an emergency. In contrast to most real-time capable approaches, our approach does not need an explicit depth sensor. Instead, we rely only on a video stream from a camera and its intrinsic calibration. By exploiting the self-motion of the unmanned aerial vehicle (UAV) flying with an oblique view around buildings, we estimate both the camera trajectory and the depth for selected images with enough novel content. To create a 3D model of the scene, we rely on a three-stage processing chain. First, we estimate the rough camera trajectory using a simultaneous localization and mapping (SLAM) algorithm. Once a suitable constellation is found, we estimate depth for local bundles of images using a Multi-View Stereo (MVS) approach and then fuse this depth…
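The abstract names three inputs to the final model: the intrinsic calibration, the estimated camera pose, and the estimated depth. As a rough illustration of how these combine geometrically, the sketch below lifts one depth map into world-space points with a pinhole camera model. It is not the authors' fusion step. The function name `backproject`, the z-depth convention, and the camera-to-world pose convention (`R`, `t`) are assumptions made for this example.

```python
def backproject(depth, fx, fy, cx, cy, R, t):
    """Lift a depth map to 3D points in world coordinates (pinhole model).

    Assumptions for this sketch (not taken from the paper):
      depth[v][u] is the depth along the optical axis; values <= 0 are invalid.
      fx, fy, cx, cy are the pinhole intrinsics in pixels.
      R is a 3x3 camera-to-world rotation (list of rows), t the camera
      position in world coordinates.
    """
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d <= 0:
                continue  # skip pixels without a valid depth estimate
            # Pixel -> camera coordinates using the intrinsic calibration.
            p_cam = ((u - cx) / fx * d, (v - cy) / fy * d, d)
            # Camera -> world coordinates using the estimated pose.
            p_world = tuple(
                sum(R[i][j] * p_cam[j] for j in range(3)) + t[i]
                for i in range(3)
            )
            points.append(p_world)
    return points


# Tiny usage example: 2x2 depth map, identity rotation, camera at (1, 2, 3).
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
cloud = backproject([[2.0, 0.0], [0.0, 4.0]], 2.0, 2.0, 0.5, 0.5,
                    identity, (1, 2, 3))
```

Accumulating such points over many selected images gives a naive point cloud. A real-time system would additionally need to merge overlapping observations and reject outliers, which this sketch does not attempt.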
