AirZoo: A Unified Large-Scale Dataset for Grounding Aerial Geometric 3D Vision
Xiaoya Cheng, Rouwan Wu, Xinyi Liu, Zeyu Cui, Yan Liu, Na Zhao, Yu Liu, Maojun Zhang, Shen Yan

TL;DR
AirZoo introduces a large-scale, diverse dataset with rich geometric annotations for aerial 3D vision, enabling improved model training and benchmarking in UAV-based sensing.
Contribution
AirZoo provides a scalable generation pipeline, extensive scene diversity, and precise geometric annotations, filling a critical gap in aerial geometric 3D vision datasets.
Findings
Fine-tuning on AirZoo improves the performance of state-of-the-art models.
AirZoo enables effective pre-training for aerial spatial tasks.
Experiments show substantial performance gains across multiple benchmarks.
Abstract
Despite the rapid progress in data-driven 3D vision, aerial geometric 3D vision remains a formidable challenge due to the severe scarcity of large-scale, high-fidelity training data. Existing benchmarks, predominantly biased toward ground-level or object-centric views, do not account for complex viewpoint transformations and diverse environmental conditions in UAV-based sensing. To bridge this critical gap, we propose AirZoo, a unified large-scale dataset and benchmark for grounding aerial geometric 3D vision. AirZoo possesses three appealing properties: 1) Scalable Generation Pipeline: Leveraging freely available, world-scale photogrammetric 3D meshes, it renders vast outdoor environments with customizable UAV flight trajectories and configurable weather/illumination. 2) Comprehensive Scene Diversity: It provides the most extensive coverage of region types to date (spanning 378 regions…
