Spatiotemporal Contrastive Learning for Cross-View Video Localization in Unstructured Off-road Terrains
Zhiyun Deng, Dongmyeong Lee, Amanda Adkins, Jesse Quattrociocchi, Christian Ellis, Joydeep Biswas

TL;DR
This paper introduces MoViX, a self-supervised framework for cross-view video localization in challenging off-road environments, achieving high accuracy despite perceptual ambiguities and seasonal appearance shifts.
Contribution
MoViX is the first to learn viewpoint- and season-invariant representations for off-road localization using a self-supervised approach with novel sampling and aggregation strategies.
Findings
Localizes within 25 meters 93% of the time on TartanDrive 2.0
Operates effectively when trained on limited data (under 30 minutes)
Generalizes well to new environments and different robot platforms
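A "within 25 meters" figure like the one above is typically the share of predicted positions that fall inside a distance threshold of the reference position. The sketch below shows that computation under stated assumptions: the function name `fraction_within`, the 2D `(x, y)` inputs in meters, and the Euclidean error are illustrative choices, not the paper's evaluation code.

```python
import math


def fraction_within(pred_xy, true_xy, threshold_m=25.0):
    """Share of predictions whose Euclidean error is at most threshold_m meters.

    pred_xy and true_xy are equal-length sequences of (x, y) positions in meters.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred_xy) != len(true_xy) or not pred_xy:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(
        1
        for (px, py), (tx, ty) in zip(pred_xy, true_xy)
        if math.hypot(px - tx, py - ty) <= threshold_m
    )
    return hits / len(pred_xy)


# Toy data: position errors of 0 m, 5 m, 30 m, and 10 m.
predictions = [(0.0, 0.0), (0.0, 0.0), (30.0, 0.0), (0.0, 10.0)]
references = [(0.0, 0.0), (3.0, 4.0), (0.0, 0.0), (0.0, 0.0)]
score = fraction_within(predictions, references)
```

Three of the four toy errors fall inside the 25 m threshold, so `score` is 0.75.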
Abstract
Robust cross-view 3-DoF localization in GPS-denied, off-road environments remains challenging due to (1) perceptual ambiguities from repetitive vegetation and unstructured terrain, and (2) seasonal shifts that significantly alter scene appearance, hindering alignment with outdated satellite imagery. To address this, we introduce MoViX, a self-supervised cross-view video localization framework that learns viewpoint- and season-invariant representations while preserving directional awareness essential for accurate localization. MoViX employs a pose-dependent positive sampling strategy to enhance directional discrimination and temporally aligned hard negative mining to discourage shortcut learning from seasonal cues. A motion-informed frame sampler selects spatially diverse frames, and a lightweight temporal aggregator emphasizes geometrically aligned observations while downweighting…
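The abstract says only that the motion-informed frame sampler selects spatially diverse frames. One simple way to get that behavior, offered as a minimal sketch and not as MoViX's actual sampler, is to keep a frame only once the robot has moved a minimum distance from the last kept frame. The function name `sample_frames_by_motion`, the `min_spacing_m` parameter, and the use of 2D odometry positions are all assumptions made for illustration.

```python
import math


def sample_frames_by_motion(positions, min_spacing_m=2.0):
    """Return indices of frames spaced at least min_spacing_m apart.

    positions holds one (x, y) odometry position in meters per video frame.
    Hypothetical distance-threshold rule; the paper's sampler may differ.
    """
    if not positions:
        return []
    kept = [0]  # always keep the first frame
    last_x, last_y = positions[0]
    for i, (x, y) in enumerate(positions[1:], start=1):
        # Keep the frame only if the robot moved far enough since the last kept one.
        if math.hypot(x - last_x, y - last_y) >= min_spacing_m:
            kept.append(i)
            last_x, last_y = x, y
    return kept


# Toy trajectory along the x axis with uneven motion between frames.
trajectory = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (2.5, 0.0), (2.6, 0.0), (5.0, 0.0)]
selected = sample_frames_by_motion(trajectory, min_spacing_m=2.0)
```

Here `selected` is `[0, 3, 5]`: frames recorded while the robot is nearly stationary are skipped, so the kept frames cover distinct places instead of repeating the same view.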
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Neural Network Applications · Indoor and Outdoor Localization Technologies
