Dynamic Visual SLAM using a General 3D Prior
Xingguang Zhong, Liren Jin, Marija Popović, Jens Behley, Cyrill Stachniss

TL;DR
This paper introduces a monocular visual SLAM system capable of robustly estimating camera poses in dynamic environments by integrating a feed-forward reconstruction model with patch-based bundle adjustment, effectively filtering out dynamic regions and handling scale ambiguities.
Contribution
It presents a novel combination of geometric patch-based bundle adjustment and feed-forward reconstruction models to improve SLAM robustness in dynamic scenes.
Findings
Enhanced robustness in dynamic environments
Effective filtering of dynamic regions
Improved accuracy of camera pose estimation
Abstract
Reliable incremental estimation of camera poses and 3D reconstruction is key to enabling various applications, including robotics, interactive visualization, and augmented reality. However, this task is particularly challenging in dynamic natural environments, where scene dynamics can severely degrade camera pose estimation accuracy. In this work, we propose a novel monocular visual SLAM system that can robustly estimate camera poses in dynamic scenes. To this end, we leverage the complementary strengths of geometric patch-based online bundle adjustment and recent feed-forward reconstruction models. Specifically, we propose a feed-forward reconstruction model to precisely filter out dynamic regions, while also utilizing its depth prediction to enhance the robustness of the patch-based visual SLAM. By aligning depth prediction with estimated patches from bundle adjustment, we robustly…
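The idea of aligning a predicted depth map with sparse patch depths from bundle adjustment can be illustrated with a minimal sketch. This is not the authors' implementation: the excerpt does not say how the alignment is computed, so the median-ratio scale estimate, the function name `align_depth_scale`, and all parameter names below are assumptions chosen for illustration only.

```python
import numpy as np

def align_depth_scale(pred_depth, patch_uv, patch_depth, dynamic_mask=None):
    """Estimate one scale factor mapping predicted depth onto patch depths.

    Hypothetical helper, not taken from the paper.
    pred_depth:   (H, W) depth map from a feed-forward model (arbitrary scale)
    patch_uv:     (N, 2) integer pixel coordinates (u = column, v = row)
    patch_depth:  (N,) patch depths estimated by bundle adjustment
    dynamic_mask: optional (H, W) boolean map, True where a pixel is dynamic
    """
    u = patch_uv[:, 0].astype(int)
    v = patch_uv[:, 1].astype(int)
    pred_at_patches = pred_depth[v, u]

    # Keep only patches with positive depth on both sides.
    valid = (pred_at_patches > 0) & (patch_depth > 0)
    if dynamic_mask is not None:
        # Drop patches that fall on regions flagged as dynamic.
        valid &= ~dynamic_mask[v, u]
    if not np.any(valid):
        raise ValueError("no valid static patches to align against")

    # The median of per-patch ratios tolerates a minority of outliers
    # (an assumed choice, not the paper's stated method).
    scale = float(np.median(patch_depth[valid] / pred_at_patches[valid]))
    return scale, scale * pred_depth
```

A single global scale is the simplest possible model here; a system applying the reconstruction model batch-wise would presumably re-estimate it for each batch, since each batch comes with its own arbitrary scale.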
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Robot Manipulation and Learning
