WildCross: A Cross-Modal Large Scale Benchmark for Place Recognition and Metric Depth Estimation in Natural Environments

Joshua Knights; Joseph Reid; Kaushik Roy; David Hall; Mark Cox; Peyman Moghadam

arXiv:2603.01475·cs.CV·March 3, 2026

TL;DR

WildCross introduces a large-scale, cross-modal benchmark dataset for place recognition and depth estimation in unstructured natural environments, addressing a gap in existing urban-focused datasets.

Contribution

WildCross provides over 476K sequential RGB frames with semi-dense depth and surface normal annotations, each aligned with accurate 6DoF poses and synchronized dense lidar submaps, enabling evaluation of multi-modal perception in natural settings.

Findings

01 Demonstrates the dataset's value as a benchmark for multi-modal perception tasks.

02 Shows that natural environments remain challenging for place recognition and depth estimation.

03 Establishes baseline results for visual, lidar, and cross-modal place recognition and for metric depth estimation.

Abstract

Recent years have seen a significant increase in demand for robotic solutions in unstructured natural environments, alongside growing interest in bridging 2D and 3D scene understanding. However, existing robotics datasets are predominantly captured in structured urban environments, making them inadequate for addressing the challenges posed by complex, unstructured natural settings. To address this gap, we propose WildCross, a cross-modal benchmark for place recognition and metric depth estimation in large-scale natural environments. WildCross comprises over 476K sequential RGB frames with semi-dense depth and surface normal annotations, each aligned with accurate 6DoF poses and synchronized dense lidar submaps. We conduct comprehensive experiments on visual, lidar, and cross-modal place recognition, as well as metric depth estimation, demonstrating the value of WildCross as a…
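Because the depth annotations are described as semi-dense, any depth evaluation has to skip pixels that have no ground-truth value. The sketch below illustrates that masking step with three error measures that are widely used for metric depth estimation (absolute relative error, RMSE, and the δ < 1.25 accuracy). It is an illustration only: this page does not state which metrics or protocol the paper uses, and the function name `depth_metrics`, the `min_depth` threshold, and the convention that a ground-truth value of 0 means "no annotation" are all assumptions.

```python
import math


def depth_metrics(pred, gt, min_depth=1e-3):
    """Compute common depth errors over pixels that have ground truth.

    pred, gt: flat sequences of depths in metres (hypothetical layout).
    A ground-truth value <= min_depth is treated as "no annotation"
    (an assumed convention, not taken from the paper).
    """
    # Keep only annotated pixels; clamp predictions so ratios are defined.
    pairs = [(max(p, min_depth), g) for p, g in zip(pred, gt) if g > min_depth]
    if not pairs:
        raise ValueError("no pixels with valid ground-truth depth")
    n = len(pairs)

    # Mean absolute error relative to the true depth.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # Root mean squared error in metres.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Fraction of pixels whose prediction is within a factor of 1.25.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n

    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


if __name__ == "__main__":
    # Third pixel has no ground truth and is ignored.
    print(depth_metrics([2.0, 4.5, 10.0, 6.0], [2.0, 5.0, 0.0, 4.0]))
```

No scale alignment is applied between prediction and ground truth here, since the task is metric rather than relative depth; a benchmark's official protocol may differ in depth range, cropping, or averaging.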

Code & Models

Models

Hugging Face: CSIRORobotics/WildCross (model)

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Neural Network Applications