FastMap: Revisiting Structure from Motion through First-Order Optimization
Jiahao Li, Haochen Wang, Muhammad Zubair Irshad, Igor Vasiljevic, Matthew R. Walter, Vitor Campagnolo Guizilini, Greg Shakhnarovich

TL;DR
FastMap is a fast, scalable structure from motion method built on first-order optimization. It substantially reduces computation time while maintaining camera pose accuracy, which makes it suitable for large-scale 3D reconstruction.
Contribution
FastMap is a novel global structure from motion approach that replaces second-order optimization with first-order methods, enhancing speed and scalability.
Findings
FastMap is up to 10 times faster than COLMAP and GLOMAP with GPU acceleration.
FastMap achieves comparable camera pose accuracy to existing methods.
Eliminating two key bottlenecks, the computational complexity and the kernel implementation of each optimization step, accounts for most of the speedup.
Abstract
We propose FastMap, a new global structure from motion method focused on speed and simplicity. Previous methods like COLMAP and GLOMAP are able to estimate high-precision camera poses, but suffer from poor scalability when the number of matched keypoint pairs becomes large, mainly due to the time-consuming process of second-order Gauss-Newton optimization. Instead, we design our method solely based on first-order optimizers. To obtain maximal speedup, we identify and eliminate two key performance bottlenecks: computational complexity and the kernel implementation of each optimization step. Through extensive experiments, we show that FastMap is up to 10 times faster than COLMAP and GLOMAP with GPU acceleration and achieves comparable pose accuracy.
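To illustrate the distinction the abstract draws between second-order Gauss-Newton and first-order optimization, here is a minimal sketch on a toy linear least-squares problem. It is not FastMap's objective or implementation; the function names, problem size, and step size are assumptions chosen only for illustration.

```python
import numpy as np

def residual(x, A, b):
    # Toy residual r(x) = A x - b (a stand-in, not an SfM reprojection error).
    return A @ x - b

def gauss_newton_step(x, A, b):
    # Second-order-style update: solve the normal equations (J^T J) dx = -J^T r.
    # The linear solve is the part that grows expensive with problem size.
    J = A  # the Jacobian of a linear residual is A itself
    r = residual(x, A, b)
    dx = np.linalg.solve(J.T @ J, -J.T @ r)
    return x + dx

def gradient_step(x, A, b, lr):
    # First-order update: only the gradient J^T r is needed, no linear solve.
    grad = A.T @ residual(x, A, b)
    return x - lr * grad

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3))
b = rng.normal(size=50)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]  # reference solution

# One Gauss-Newton step solves a linear problem exactly.
x_gn = gauss_newton_step(np.zeros(3), A, b)

# Gradient descent takes many cheap steps instead.
lr = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / largest eigenvalue of A^T A
x_gd = np.zeros(3)
for _ in range(2000):
    x_gd = gradient_step(x_gd, A, b, lr)
```

Both iterates end up at the same least-squares solution. The trade-off is that each first-order step is cheap and maps well to GPU kernels, at the cost of needing many more iterations than a second-order method.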
Taxonomy
Topics: Robotics and Sensor-Based Localization · Robot Manipulation and Learning · Advanced Vision and Imaging
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
