MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos
Zhengqi Li, Richard Tucker, Forrester Cole, Qianqian Wang, Linyi Jin, Vickie Ye, Angjoo Kanazawa, Aleksander Holynski, Noah Snavely

TL;DR
MegaSaM introduces a deep visual SLAM system that accurately, quickly, and robustly estimates camera motion and depth from casual, dynamic monocular videos, outperforming prior methods in real-world scenarios.
Contribution
The paper adapts a deep visual SLAM framework for dynamic scenes with minimal parallax, achieving high accuracy and robustness in challenging real-world videos.
Findings
Outperforms prior methods in accuracy and robustness
Operates efficiently on complex dynamic videos
Handles unconstrained camera motions effectively
Abstract
We present a system that allows for accurate, fast, and robust estimation of camera parameters and depth maps from casual monocular videos of dynamic scenes. Most conventional structure from motion and monocular SLAM techniques assume input videos that feature predominantly static scenes with large amounts of parallax. Such methods tend to produce erroneous estimates in the absence of these conditions. Recent neural network-based approaches attempt to overcome these challenges; however, such methods are either computationally expensive or brittle when run on dynamic videos with uncontrolled camera motion or unknown field of view. We demonstrate the surprising effectiveness of a deep visual SLAM framework: with careful modifications to its training and inference schemes, this system can scale to real-world videos of complex dynamic scenes with unconstrained camera paths, including videos…
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Medical Imaging and Analysis
