Towards Live 3D Reconstruction from Wearable Video: An Evaluation of V-SLAM, NeRF, and Videogrammetry Techniques
David Ramirez, Suren Jayasuriya, Andreas Spanias

TL;DR
This paper evaluates 3D reconstruction algorithms, including V-SLAM, NeRF, and videogrammetry, for live large-scale environment mapping from wearable video in military mixed reality applications.
Contribution
It provides a quantitative analysis of the computational speed of different 3D reconstruction methods on live video data, focusing on military and large-scale environments.
Findings
NeRF with Instant-NGP offers high-quality 3D reconstructions.
V-SLAM algorithms like ORB-SLAM3 demonstrate faster processing speeds.
Trade-offs exist between reconstruction quality and speed for real-time applications.
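One way to make the speed side of this trade-off concrete is a real-time factor: the rate at which a method processes frames divided by the rate at which the camera captures them. The sketch below is a minimal illustration of that idea, not the paper's evaluation code; the names `SpeedMeasurement`, `real_time_factor`, and `keeps_up` are hypothetical, and the numbers in the usage lines are placeholders rather than results from the paper.

```python
from dataclasses import dataclass


@dataclass
class SpeedMeasurement:
    """Timing for one reconstruction run (hypothetical container, not from the paper)."""
    method: str
    frames_processed: int
    seconds_elapsed: float


def real_time_factor(m: SpeedMeasurement, video_fps: float) -> float:
    """Processing rate divided by capture rate; 1.0 or more means the method keeps pace."""
    if m.seconds_elapsed <= 0 or video_fps <= 0:
        raise ValueError("seconds_elapsed and video_fps must be positive")
    processing_fps = m.frames_processed / m.seconds_elapsed
    return processing_fps / video_fps


def keeps_up(m: SpeedMeasurement, video_fps: float) -> bool:
    """True when the method processes frames at least as fast as they arrive."""
    return real_time_factor(m, video_fps) >= 1.0


# Placeholder measurements for illustration only (not data from the paper).
fast_run = SpeedMeasurement("method_a", frames_processed=600, seconds_elapsed=10.0)
slow_run = SpeedMeasurement("method_b", frames_processed=150, seconds_elapsed=10.0)

for run in (fast_run, slow_run):
    print(run.method, real_time_factor(run, video_fps=30.0), keeps_up(run, video_fps=30.0))
```

A factor below 1.0 means frames arrive faster than they are processed, so a live pipeline would have to drop frames or fall behind; reconstruction quality would need to be scored separately and weighed against this number.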
Abstract
Mixed reality (MR) is a key technology that promises to change the future of warfare. An MR hybrid of physical outdoor environments and virtual military training will enable engagements with long-distance enemies, both real and simulated. To enable this technology, a large-scale 3D model of a physical environment must be maintained based on live sensor observations. 3D reconstruction algorithms should utilize the low cost and pervasiveness of video camera sensors, from both overhead and soldier-level perspectives. Mapping speed and 3D quality can be balanced to enable live MR training in dynamic environments. Given these requirements, we survey several 3D reconstruction algorithms for large-scale mapping in military applications given only live video. We measure 3D reconstruction performance from common structure from motion, visual-SLAM, and photogrammetry techniques. This includes…
Taxonomy
Topics: Advanced Optical Sensing Technologies · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
