TL;DR
This paper introduces CAL2M, a calibration-free SLAM framework that leverages visual geometry foundation models and an assistant eye to achieve kilometer-level, globally consistent mapping without prior calibration.
Contribution
CAL2M provides a novel, plug-and-play SLAM approach that eliminates the need for pre-calibration and models complex geometric distortions with nonlinear transformations.
Findings
Eliminates scale ambiguity without pre-calibration.
Effectively corrects rotation and translation errors via epipolar-guided correction.
Ensures globally consistent large-scale mapping through anchor propagation.
Abstract
Visual Geometry Foundation Models (VGFMs) demonstrate remarkable zero-shot capabilities in local reconstruction. However, deploying them for kilometer-level Simultaneous Localization and Mapping (SLAM) remains challenging. In such scenarios, current approaches mainly rely on linear transforms (e.g., Sim3 and SL4) for sub-map alignment, while we argue that a single linear transform is fundamentally insufficient to model the complex, non-linear geometric distortions inherent in VGFM outputs. Forcing such rigid alignment leads to the rapid accumulation of uncorrected residuals, eventually resulting in significant trajectory drift and map divergence. To address these limitations, we present CAL2M (Calibration-free Assistant-eye based Large-scale Localization and Mapping), a plug-and-play framework compatible with arbitrary VGFMs. Distinct from traditional systems, CAL2M introduces an…
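The abstract's argument that one similarity transform cannot absorb non-linear distortion can be illustrated with a toy example. The sketch below is a 2D analog of Sim(3) sub-map alignment, not CAL2M's method: it fits a single least-squares similarity (scale, rotation, translation) to two hypothetical sub-maps, one related to the reference by an exact similarity and one with an added quadratic bend. All names, the grid, and the distortion model are assumptions made for illustration only.

```python
import cmath
import math


def fit_similarity_2d(src, dst):
    """Least-squares 2D similarity mapping src -> dst.

    Points are complex numbers x + iy. The model is dst ~= a * src + b,
    where the complex number a encodes scale and rotation and b encodes
    translation.
    """
    n = len(src)
    src_mean = sum(src) / n
    dst_mean = sum(dst) / n
    # Closed-form solution on centered points.
    num = sum((s - src_mean).conjugate() * (d - dst_mean)
              for s, d in zip(src, dst))
    den = sum(abs(s - src_mean) ** 2 for s in src)
    a = num / den
    b = dst_mean - a * src_mean
    return a, b


def rms_residual(src, dst, a, b):
    """Root-mean-square alignment error after applying the fitted transform."""
    sq = sum(abs(a * s + b - d) ** 2 for s, d in zip(src, dst))
    return math.sqrt(sq / len(src))


# Hypothetical reference sub-map: a 5x5 grid of 2D points.
grid = [complex(x, y) for x in range(5) for y in range(5)]

# Case 1: the second sub-map differs by an exact similarity
# (scale 2, rotation 30 degrees, translation (3, -1)).
true_a = 2.0 * cmath.exp(1j * math.radians(30))
true_b = complex(3.0, -1.0)
rigid = [true_a * p + true_b for p in grid]

# Case 2: the same sub-map with an assumed non-linear (quadratic) bend,
# standing in for the distortion the abstract attributes to VGFM outputs.
bent = [q + complex(0.0, 0.05 * p.real ** 2) for p, q in zip(grid, rigid)]

a1, b1 = fit_similarity_2d(grid, rigid)
a2, b2 = fit_similarity_2d(grid, bent)
res_rigid = rms_residual(grid, rigid, a1, b1)
res_bent = rms_residual(grid, bent, a2, b2)

print(f"residual, exact similarity: {res_rigid:.3e}")
print(f"residual, non-linear bend:  {res_bent:.3e}")
```

In the first case the fit recovers the transform and the residual is numerically zero. In the second case a residual remains however the single transform is chosen, which is the kind of uncorrected error the abstract says accumulates across sub-maps.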
