TL;DR
BEDI is a comprehensive benchmark for evaluating UAV-embodied agents, introducing a standardized framework, diverse scenarios, and open interfaces to advance research in autonomous UAV tasks.
Contribution
The paper presents BEDI, a benchmark for UAV-embodied agents built on a Dynamic Chain-of-Embodied-Task paradigm, a unified evaluation framework covering six core sub-skills, and a hybrid testing platform.
Findings
State-of-the-art VLMs show limitations in UAV tasks.
BEDI enables objective comparison of UAV-embodied models.
The benchmark supports diverse virtual and real-world scenarios.
Abstract
With the rapid advancement of low-altitude remote sensing and Vision-Language Models (VLMs), Embodied Agents based on Unmanned Aerial Vehicles (UAVs) have shown significant potential in autonomous tasks. However, current evaluation methods for UAV-Embodied Agents (UAV-EAs) remain constrained by the lack of standardized benchmarks, diverse testing scenarios and open system interfaces. To address these challenges, we propose BEDI (Benchmark for Embodied Drone Intelligence), a systematic and standardized benchmark designed for evaluating UAV-EAs. Specifically, we introduce a novel Dynamic Chain-of-Embodied-Task paradigm based on the perception-decision-action loop, which decomposes complex UAV tasks into standardized, measurable subtasks. Building on this paradigm, we design a unified evaluation framework encompassing six core sub-skills: semantic perception, spatial perception, motion…
