UAV-CodeAgents: Scalable UAV Mission Planning via Multi-Agent ReAct and Vision-Language Reasoning
Oleg Sautenkov, Yasheerah Yaqoot, Muhammad Ahsan Mustafa, Faryal Batool, Jeffrin Sam, Artem Lykov, Chih-Yung Wen, and Dzmitry Tsetserukou

TL;DR
UAV-CodeAgents is a scalable multi-agent framework that uses large language and vision-language models for autonomous UAV mission planning. It localizes semantic targets on aerial maps and adapts to changing environments, with a reported 93% success rate in large-scale scenarios.
Contribution
The paper presents a novel multi-agent system that combines the ReAct paradigm with vision-language reasoning for UAV mission planning, together with a new pixel-pointing mechanism and a benchmark dataset.
Findings
Higher reliability at lower decoding temperature (0.5)
Average mission creation time of 96.96 seconds
Success rate of 93% in large-scale scenarios
Abstract
We present UAV-CodeAgents, a scalable multi-agent framework for autonomous UAV mission generation, built on large language and vision-language models (LLMs/VLMs). The system leverages the ReAct (Reason + Act) paradigm to interpret satellite imagery, ground high-level natural language instructions, and collaboratively generate UAV trajectories with minimal human supervision. A core component is a vision-grounded, pixel-pointing mechanism that enables precise localization of semantic targets on aerial maps. To support real-time adaptability, we introduce a reactive thinking loop, allowing agents to iteratively reflect on observations, revise mission goals, and coordinate dynamically in evolving environments. UAV-CodeAgents is evaluated on large-scale mission scenarios involving industrial and environmental fire detection. Our results show that a lower decoding temperature (0.5) yields…
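To make the pixel-pointing idea concrete: once a model points at a pixel on a satellite image, that pixel has to become a geographic waypoint. The following is a minimal sketch of one way to do that, not the paper's implementation. It assumes a north-up image whose corner coordinates are known and uses plain linear interpolation, which ignores map-projection distortion and is only reasonable for small tiles. The names `GeoBounds` and `pixel_to_latlon` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoBounds:
    """Geographic extent of a north-up map tile (hypothetical helper)."""
    north: float  # latitude of the top edge, degrees
    south: float  # latitude of the bottom edge, degrees
    west: float   # longitude of the left edge, degrees
    east: float   # longitude of the right edge, degrees


def pixel_to_latlon(x, y, width, height, bounds):
    """Map an image pixel (x right, y down) to (lat, lon).

    Assumption: linear interpolation between the tile edges, which
    is an approximation valid only for small, north-up tiles.
    """
    if not (0 <= x <= width and 0 <= y <= height):
        raise ValueError("pixel lies outside the image")
    # Longitude grows left to right across the image.
    lon = bounds.west + (x / width) * (bounds.east - bounds.west)
    # Latitude shrinks top to bottom, because image y points down.
    lat = bounds.north - (y / height) * (bounds.north - bounds.south)
    return lat, lon


if __name__ == "__main__":
    tile = GeoBounds(north=10.0, south=0.0, west=20.0, east=40.0)
    # A point halfway across and a quarter of the way down the tile.
    print(pixel_to_latlon(500, 250, 1000, 1000, tile))
```

A real pipeline would more likely take the georeferencing transform from the imagery source than from hand-entered bounds. The sketch only illustrates the conversion step that sits between a pointed pixel and a trajectory waypoint.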
Taxonomy
Topics: UAV Applications and Optimization · Multimodal Machine Learning Applications · Advanced Neural Network Applications
