AirZoo: A Unified Large-Scale Dataset for Grounding Aerial Geometric 3D Vision

Xiaoya Cheng; Rouwan Wu; Xinyi Liu; Zeyu Cui; Yan Liu; Na Zhao; Yu Liu; Maojun Zhang; Shen Yan

arXiv:2604.26567 · cs.CV · April 30, 2026

TL;DR

AirZoo introduces a large-scale, diverse dataset with rich geometric annotations for aerial 3D vision, enabling improved model training and benchmarking in UAV-based sensing.

Contribution

AirZoo provides a scalable generation pipeline, extensive scene diversity, and precise geometric annotations, filling a critical gap in aerial geometric 3D vision datasets.

Findings

01. Fine-tuning on AirZoo improves state-of-the-art model performance.

02. AirZoo enables effective pre-training for aerial spatial tasks.

03. Experiments show substantial performance gains across multiple benchmarks.

Abstract

Despite the rapid progress in data-driven 3D vision, aerial geometric 3D vision remains a formidable challenge due to the severe scarcity of large-scale, high-fidelity training data. Existing benchmarks, predominantly biased toward ground-level or object-centric views, do not account for complex viewpoint transformations and diverse environmental conditions in UAV-based sensing. To bridge this critical gap, we propose AirZoo, a unified large-scale dataset and benchmark for grounding aerial geometric 3D vision. AirZoo possesses three appealing properties: 1) Scalable Generation Pipeline: Leveraging freely available, world-scale photogrammetric 3D meshes, it renders vast outdoor environments with customizable UAV flight trajectories and configurable weather/illumination. 2) Comprehensive Scene Diversity: It provides the most extensive coverage of region types to date (spanning 378 regions…
