ESARBench: A Benchmark for Agentic UAV Embodied Search and Rescue

Daoxuan Zhang; Ping Chen; Jianyi Zhou; Shuo Yang

arXiv:2605.01371·cs.RO·May 5, 2026

TL;DR

ESARBench introduces a comprehensive, realistic benchmark for evaluating UAV agents driven by multimodal large language models in complex search and rescue scenarios, addressing the lack of a unified benchmark for embodied agents in UAV search and rescue.

Contribution

The paper proposes the novel ESAR task and presents the first high-fidelity, GIS-mapped benchmark for UAV search and rescue, including diverse environments and evaluation metrics.

Findings

01. Experimental results reveal challenges in spatial memory and aerial adaptation.

02. The results identify trade-offs between search efficiency and flight safety.

03. Baseline evaluations highlight bottlenecks in current UAV SAR methods.

Abstract

The rapid advancement of Multimodal Large Language Models (MLLMs) has empowered Unmanned Aerial Vehicles (UAVs) with exceptional capabilities in spatial reasoning, semantic understanding, and complex decision-making, making them inherently suited for Search and Rescue (SAR). However, existing UAV SAR research is dominated by traditional vision and path-planning methods and lacks a comprehensive, unified benchmark for embodied agents. To bridge this gap, we first propose the novel task of Embodied Search and Rescue (ESAR), which requires aerial agents to autonomously explore complex environments, identify rescue clues, and reason about victim locations in order to make informed decisions. Additionally, we present ESARBench, the first comprehensive benchmark designed to evaluate MLLM-driven UAV agents in highly realistic SAR scenarios. Leveraging Unreal Engine 5 and…
