MonoEM-GS: Monocular Expectation-Maximization Gaussian Splatting SLAM
Evgenii Kruzhkov, Sven Behnke

TL;DR
MonoEM-GS is a monocular SLAM pipeline that fuses noisy, view-dependent predictions from feed-forward geometric foundation models into a global Gaussian Splatting map, using Expectation-Maximization to stabilize the geometry and supporting downstream queries directly on the map.
Contribution
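MonoEM-GS couples Gaussian Splatting with an Expectation-Maximization (EM) formulation to stabilize geometry, uses ICP-based alignment for monocular pose estimation, and parameterizes Gaussians with multi-modal features to enable in-place open-set segmentation.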
Findings
Outperforms recent baselines on 7-Scenes, TUM RGB-D, and Replica datasets.
Effectively stabilizes geometry predictions across viewpoints and transformations.
Enables in-place segmentation and downstream queries directly on the reconstructed map.
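To give a feel for how an EM formulation can stabilize noisy geometry, here is a minimal sketch that fuses repeated depth predictions of a single surface point. It is not the paper's formulation: the inlier-Gaussian plus uniform-outlier mixture, the function name `em_fuse_depth`, and all parameters (`depth_range`, `init_sigma`, `min_var`) are assumptions chosen for illustration.

```python
import math

def em_fuse_depth(depths, depth_range=(0.0, 10.0), init_sigma=0.5,
                  iters=30, min_var=1e-6):
    """Fuse noisy depth samples with EM (hypothetical illustration).

    Model: each sample is either an inlier drawn from N(mu, var) or an
    outlier drawn uniformly from depth_range.
    Returns (mu, sigma, inlier responsibilities).
    """
    lo, hi = depth_range
    outlier_pdf = 1.0 / (hi - lo)
    n = len(depths)
    mu = sorted(depths)[n // 2]   # start from a middle sample for robustness
    var = init_sigma ** 2
    pi_in = 0.9                   # prior probability that a sample is an inlier
    resp = [1.0] * n
    for _ in range(iters):
        # E-step: posterior probability that each sample is an inlier.
        resp = []
        for d in depths:
            inlier = (pi_in * math.exp(-0.5 * (d - mu) ** 2 / var)
                      / math.sqrt(2.0 * math.pi * var))
            outlier = (1.0 - pi_in) * outlier_pdf
            resp.append(inlier / (inlier + outlier))
        # M-step: re-estimate the Gaussian and the inlier ratio.
        total = sum(resp)
        mu = sum(r * d for r, d in zip(resp, depths)) / total
        var = max(min_var,
                  sum(r * (d - mu) ** 2 for r, d in zip(resp, depths)) / total)
        # Keep the ratio away from 0 and 1 so both components stay active.
        pi_in = min(0.99, max(0.01, total / n))
    return mu, math.sqrt(var), resp

# Five consistent predictions and one gross outlier.
mu, sigma, resp = em_fuse_depth([2.0, 2.02, 1.98, 2.01, 1.99, 5.0])
print(round(mu, 3), [round(r, 2) for r in resp])
```

The responsibilities act as soft weights, so an inconsistent prediction from one viewpoint is down-weighted rather than averaged into the estimate.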
Abstract
Feed-forward geometric foundation models can infer dense point clouds and camera motion directly from RGB streams, providing priors for monocular SLAM. However, their predictions are often view-dependent and noisy: geometry can vary across viewpoints and under image transformations, and local metric properties may drift between frames. We present MonoEM-GS, a monocular mapping pipeline that integrates such geometric predictions into a global Gaussian Splatting representation while explicitly addressing these inconsistencies. MonoEM-GS couples Gaussian Splatting with an Expectation-Maximization formulation to stabilize geometry, and employs ICP-based alignment for monocular pose estimation. Beyond geometry, MonoEM-GS parameterizes Gaussians with multi-modal features, enabling in-place open-set segmentation and other downstream queries directly on the reconstructed map. We evaluate…
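The abstract mentions ICP-based alignment for pose estimation without giving details. As a rough illustration of the general technique, the sketch below implements textbook point-to-point ICP with brute-force nearest neighbours and a closed-form (Kabsch) rigid fit in NumPy. It is an assumption-laden toy, not the authors' implementation: the function names, the brute-force matching, and the fixed iteration count are all hypothetical choices.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t such that dst ~= src @ R.T + t for paired (N, 3) points."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def nearest_neighbors(src, dst):
    """For each src point: index of the closest dst point and its squared distance."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return idx, d2[np.arange(len(src)), idx]

def icp(src, dst, iters=20):
    """Point-to-point ICP; returns the accumulated R, t mapping src onto dst."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        idx, _ = nearest_neighbors(cur, dst)             # match
        R, t = best_rigid_transform(cur, dst[idx])       # fit
        cur = cur @ R.T + t                              # apply
        R_total, t_total = R @ R_total, R @ t_total + t  # compose
    return R_total, t_total

# Synthetic check: a cloud moved by a small known rotation and translation.
rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(60, 3))
a = np.deg2rad(5.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.05, -0.02, 0.03])
moved = pts @ R_true.T + t_true
R_est, t_est = icp(pts, moved, iters=10)
print(np.round(R_est, 3), np.round(t_est, 3))
```

In a mapping pipeline, `dst` would play the role of points drawn from the global map and `src` the points predicted for the incoming frame; a practical system would replace the brute-force search with a spatial index and add outlier rejection.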
