GGD-SLAM: Monocular 3DGS SLAM Powered by Generalizable Motion Model for Dynamic Environments
Yi Liu, Haoxuan Xu, Hongbo Duan, Keyu Fan, Zhengyang Zhang, Peiyu Zhuang, Pengting Luo, Houde Liu

TL;DR
GGD-SLAM is a monocular 3DGS SLAM framework for dynamic environments. It combines a generalizable motion model, dynamic feature separation, and occlusion filling, and reports state-of-the-art camera pose estimation in dynamic scenes.
Contribution
The paper proposes a monocular SLAM system for dynamic scenes that requires neither predefined semantic annotations nor depth input. It relies on a generalizable motion model and a dynamic feature enhancer to separate static from dynamic components.
Findings
Achieves state-of-the-art camera pose estimation in dynamic scenes.
Effectively separates static and dynamic components for dense mapping.
Improves robustness against dynamic distractors with a novel SSIM loss.
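For readers unfamiliar with SSIM-based losses, the sketch below shows the plain, textbook form: one minus the structural similarity between a rendered image and a target image. It is not the paper's novel SSIM loss, whose formulation is not given in this summary. The function names `global_ssim` and `ssim_loss` are hypothetical, and the code assumes a single global window over the whole image rather than the sliding windows usually used in practice.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Standard SSIM computed once over the whole image (single global window).

    Simplifying assumption: real pipelines average SSIM over local windows.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")

    # Stabilising constants from the original SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()

    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(rendered, target, data_range=1.0):
    """Dissimilarity term: 0 for identical images, larger as structure diverges."""
    return 1.0 - global_ssim(rendered, target, data_range)
```

A loss of this shape penalises structural differences between the rendered map and the observed frame. How GGD-SLAM modifies it to resist dynamic distractors is described in the paper itself.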
Abstract
Visual SLAM algorithms achieve significant improvements through the exploration of 3D Gaussian Splatting (3DGS) representations, particularly in generating high-fidelity dense maps. However, they depend on a static environment assumption and experience significant performance degradation in dynamic environments. This paper presents GGD-SLAM, a framework that employs a generalizable motion model to address the challenges of localization and dense mapping in dynamic environments, without predefined semantic annotations or depth input. Specifically, the proposed system employs a First-In-First-Out (FIFO) queue to manage incoming frames, facilitating dynamic semantic feature extraction through a sequential attention mechanism. This is integrated with a dynamic feature enhancer to separate static and dynamic components. Additionally, to minimize dynamic distractors' impact on the static…
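The FIFO frame management mentioned in the abstract can be illustrated with a minimal sketch. The class name `FrameQueue`, its methods, and the capacity are hypothetical; the abstract does not state the queue length or what is stored per frame, so frames are treated here as opaque objects.

```python
from collections import deque


class FrameQueue:
    """Fixed-capacity FIFO window over incoming frames (hypothetical helper)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        # deque with maxlen drops the oldest item automatically when full.
        self._frames = deque(maxlen=capacity)

    def push(self, frame):
        """Append the newest frame; return the evicted oldest frame, or None."""
        evicted = None
        if len(self._frames) == self._frames.maxlen:
            evicted = self._frames[0]
        self._frames.append(frame)
        return evicted

    def is_full(self):
        return len(self._frames) == self._frames.maxlen

    def window(self):
        """Frames ordered oldest to newest, e.g. as input to a sequence model."""
        return list(self._frames)


# Usage: keep a sliding window of the three most recent frames.
queue = FrameQueue(capacity=3)
for frame_id in range(5):
    queue.push(frame_id)
print(queue.window())
```

A bounded window of this kind gives a sequential model a fixed-length, temporally ordered input, which is presumably what the sequential attention mechanism consumes. That link is an inference from the abstract, not a confirmed detail.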
