VISTA: Monocular Segmentation-Based Mapping for Appearance and View-Invariant Global Localization
Hannah Shafferman, Annika Thomas, Jouko Kinnari, Michael Ricard, Jose Nino, Jonathan How

TL;DR
VISTA is a monocular, segmentation-based global localization framework that aligns reference frames between agents or sessions across different environments, viewpoints, and seasons without training, achieving high recall with low memory usage.
Contribution
VISTA introduces a domain-agnostic approach that combines object-based segmentation, tracking, and geometric matching between submaps for appearance- and view-invariant localization.
Findings
Up to 69% improvement in recall over baseline methods.
Maintains a compact map only 0.6% the size of baseline maps.
Capable of real-time operation on resource-constrained platforms.
Abstract
Global localization is critical for autonomous navigation, particularly in scenarios where an agent must localize within a map generated in a different session or by another agent, as agents often have no prior knowledge about the correlation between reference frames. However, this task remains challenging in unstructured environments due to appearance changes induced by viewpoint variation, seasonal changes, spatial aliasing, and occlusions -- known failure modes for traditional place recognition methods. To address these challenges, we propose VISTA (View-Invariant Segmentation-Based Tracking for Frame Alignment), a novel open-set, monocular global localization framework that combines: 1) a front-end, object-based, segmentation and tracking pipeline, followed by 2) a submap correspondence search, which exploits geometric consistencies between environment maps to align vehicle…
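The geometric-consistency idea behind the submap correspondence search can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each tracked object has already been reduced to a 3D centroid, and the function names (`consistency_matrix`, `greedy_consistent_set`, `rigid_align`), the tolerance `tol`, and the greedy selection step are all hypothetical choices made for illustration. Putative object pairs are kept only if they preserve inter-object distances, and the surviving pairs are used to estimate the rigid transform between the two map frames with the Kabsch method.

```python
import numpy as np

def consistency_matrix(map_a, map_b, pairs, tol=0.3):
    """C[p, q] is True when putative pairs p and q preserve the distance
    between their objects in both maps (within tol, an assumed threshold)."""
    pairs = np.asarray(pairs)
    pa = map_a[pairs[:, 0]]  # centroids in map A, one per putative pair
    pb = map_b[pairs[:, 1]]  # matched centroids in map B
    da = np.linalg.norm(pa[:, None] - pa[None, :], axis=-1)
    db = np.linalg.norm(pb[:, None] - pb[None, :], axis=-1)
    return np.abs(da - db) <= tol

def greedy_consistent_set(C):
    """Greedy stand-in for a consistent-set search (not an exact clique solver):
    visit pairs by decreasing consistency count and keep each one that agrees
    with everything kept so far."""
    order = np.argsort(-C.sum(axis=1), kind="stable")
    chosen = []
    for p in order:
        if all(C[p, q] for q in chosen):
            chosen.append(int(p))
    return sorted(chosen)

def rigid_align(src, dst):
    """Kabsch method: least-squares R, t such that dst ~= src @ R.T + t."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    D = np.eye(src.shape[1])
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    D[-1, -1] = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ D @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Hypothetical example: five object centroids seen in two sessions.
map_a = np.array([[0, 0, 0], [4, 0, 0], [0, 7, 0], [0, 0, 11], [5, 9, 2]], float)
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([10.0, -3.0, 2.0])
map_b = map_a @ R_true.T + t_true

# Five correct associations followed by two wrong ones.
pairs = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (0, 3), (2, 1)]
C = consistency_matrix(map_a, map_b, pairs)
inliers = greedy_consistent_set(C)
kept = np.asarray(pairs)[inliers]
R_est, t_est = rigid_align(map_a[kept[:, 0]], map_b[kept[:, 1]])
```

Because pairwise distances are unchanged by rotation and translation, this check needs no appearance descriptors, which is one reason a geometric test of this kind can tolerate seasonal and viewpoint changes. A real system would replace the greedy step with a more robust search and would have to handle noisy centroids and partial overlap between maps.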
Taxonomy
Topics: Face recognition and analysis · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization
