LogisticsVLN: Vision-Language Navigation For Low-Altitude Terminal Delivery Based on Agentic UAVs
Xinyuan Zhang, Yonglin Tian, Fei Lin, Yue Liu, Jing Ma, Kornélia Sára Szatmáry, Fei-Yue Wang

TL;DR
LogisticsVLN introduces a novel UAV-based vision-language navigation system for precise terminal delivery, utilizing multimodal large language models and a new dataset to demonstrate feasibility and guide future improvements.
Contribution
The paper presents LogisticsVLN, a scalable aerial delivery system integrating multimodal large language models for autonomous terminal delivery, and introduces the VLD dataset for research and evaluation.
Findings
Feasibility demonstrated on the VLD dataset
Effective modular pipeline for request understanding and action decision
Insights into robustness and deployment challenges
Abstract
The growing demand for intelligent logistics, particularly fine-grained terminal delivery, underscores the need for autonomous UAV (Unmanned Aerial Vehicle)-based delivery systems. However, most existing last-mile delivery studies rely on ground robots, while current UAV-based Vision-Language Navigation (VLN) tasks primarily focus on coarse-grained, long-range goals, making them unsuitable for precise terminal delivery. To bridge this gap, we propose LogisticsVLN, a scalable aerial delivery system built on multimodal large language models (MLLMs) for autonomous terminal delivery. LogisticsVLN integrates lightweight Large Language Models (LLMs) and Visual-Language Models (VLMs) in a modular pipeline for request understanding, floor localization, object detection, and action-decision making. To support research and evaluation in this new setting, we construct the Vision-Language Delivery…
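The modular pipeline named in the abstract (request understanding, floor localization, object detection, action decision) can be pictured as a chain of stages that each read and update a shared state. The sketch below is illustrative only and is not the authors' implementation: every name in it (`DeliveryState`, `understand_request`, `PIPELINE`, the action strings) is hypothetical, and the LLM and VLM calls are replaced by trivial stand-ins so the example runs on its own.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DeliveryState:
    """Hypothetical shared state passed between pipeline stages."""
    request: str
    target_floor: int = 0          # floor named in the request (0 = unknown)
    current_floor: int = 1         # floor the UAV is currently level with
    target_visible: bool = False
    actions: List[str] = field(default_factory=list)


Stage = Callable[[DeliveryState], DeliveryState]


def understand_request(state: DeliveryState) -> DeliveryState:
    # Stand-in for an LLM call: extract a floor number from the request text.
    match = re.search(r"floor (\d+)", state.request.lower())
    if match:
        state.target_floor = int(match.group(1))
    return state


def localize_floor(state: DeliveryState) -> DeliveryState:
    # Stand-in for a VLM-based floor estimate; the given value is kept as is.
    return state


def detect_target(state: DeliveryState) -> DeliveryState:
    # Stand-in for VLM object detection: assume the target is visible
    # once the UAV is level with the requested floor.
    state.target_visible = (
        state.target_floor != 0 and state.current_floor == state.target_floor
    )
    return state


def decide_action(state: DeliveryState) -> DeliveryState:
    # Choose one high-level action from the current state.
    if state.target_floor == 0:
        state.actions.append("hover")
    elif state.target_visible:
        state.actions.append("approach_target")
    elif state.current_floor < state.target_floor:
        state.actions.append("ascend")
    else:
        state.actions.append("descend")
    return state


# Stage order follows the list given in the abstract.
PIPELINE: List[Stage] = [
    understand_request,
    localize_floor,
    detect_target,
    decide_action,
]


def step(state: DeliveryState) -> DeliveryState:
    """Run every stage once and return the updated state."""
    for stage in PIPELINE:
        state = stage(state)
    return state
```

Keeping each stage behind the same function signature is one way to make the modules swappable, for example replacing a stand-in with a real model call without touching the other stages.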
Taxonomy
Topics: Distributed Control Multi-Agent Systems
