UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents

Jianqiang Xiao; Yuexuan Sun; Yixin Shao; Boxi Gan; Rongqiang Liu; Yanjing Wu; Weili Guan; Xiang Deng

arXiv:2508.00288 · cs.RO · August 25, 2025

TL;DR

UAV-ON introduces a comprehensive benchmark for aerial agents to perform large-scale object goal navigation in complex open-world environments using semantic goals, challenging current methods and advancing UAV autonomy research.

Contribution

The paper presents UAV-ON, a new benchmark with diverse environments, semantic instructions, and baseline evaluations for aerial object goal navigation without relying on detailed instructions.

Findings

01. Baselines struggle with semantic grounding in aerial navigation.
02. UAV-ON covers urban, natural, and mixed environments.
03. Challenges highlight the need for improved semantic reasoning in UAVs.

Abstract

Aerial navigation is a fundamental yet underexplored capability in embodied intelligence, enabling agents to operate in large-scale, unstructured environments where traditional navigation paradigms fall short. However, most existing research follows the Vision-and-Language Navigation (VLN) paradigm, which heavily depends on sequential linguistic instructions, limiting its scalability and autonomy. To address this gap, we introduce UAV-ON, a benchmark for large-scale Object Goal Navigation (ObjectNav) by aerial agents in open-world environments, where agents operate based on high-level semantic goals without relying on detailed instructional guidance as in VLN. UAV-ON comprises 14 high-fidelity Unreal Engine environments with diverse semantic regions and complex spatial layouts, covering urban, natural, and mixed-use settings. It defines 1270 annotated target objects, each characterized…

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Reinforcement Learning in Robotics