Model-aided Deep Reinforcement Learning for Sample-efficient UAV Trajectory Design in IoT Networks
Omid Esrafilian, Harald Bayerlein, and David Gesbert

TL;DR
This paper introduces a model-aided deep Q-learning method that significantly reduces training data requirements for UAV trajectory optimization in IoT networks, enabling efficient data collection without prior channel knowledge.
Contribution
It proposes a novel model-aided DRL approach using environment modeling and anchor nodes to cut down training data needs for UAV trajectory design.
Findings
Requires at least ten times fewer training samples than standard DRL methods.
Achieves comparable data collection performance with limited prior knowledge.
Demonstrates practical viability for real-world UAV IoT applications.
Abstract
Deep Reinforcement Learning (DRL) is gaining attention as a potential approach to design trajectories for autonomous unmanned aerial vehicles (UAVs) used as flying access points in the context of cellular or Internet of Things (IoT) connectivity. DRL solutions offer the advantage of on-the-go learning, hence relying on very little prior contextual information. A corresponding drawback, however, lies in the need for many learning episodes, which severely restricts the applicability of such an approach in real-world time- and energy-constrained missions. Here, we propose a model-aided deep Q-learning approach that, in contrast to previous work, considerably reduces the need for extensive training data samples, while still achieving the overarching goal of DRL, i.e., to guide a battery-limited UAV on an efficient data harvesting trajectory, without prior knowledge of wireless channel characteristics…
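To make the idea of reducing real-world training samples with a learned model more concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it swaps the paper's deep Q-network, anchor nodes, and wireless channel model for a generic tabular, Dyna-Q-style loop on a toy grid. The grid size, goal cell, reward, and all function names (`step`, `q_update`, `train`, `greedy_path_length`) are hypothetical and chosen only for illustration. The point it shows is that each real interaction is stored in a model and then replayed several times, so the agent gets more Q-updates per real sample.

```python
import random
from collections import defaultdict

# Hypothetical toy setting (an assumption, not from the paper):
# a 5x5 grid, one goal cell standing in for an IoT node to reach.
GRID = 5
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
GOAL = (4, 4)
START = (0, 0)


def step(state, action):
    """Toy environment: move on the grid, reward 1.0 on reaching GOAL."""
    x = min(max(state[0] + action[0], 0), GRID - 1)
    y = min(max(state[1] + action[1], 0), GRID - 1)
    nxt = (x, y)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def q_update(q, s, a, r, s2, done, alpha=0.5, gamma=0.9):
    """Standard one-step Q-learning update on a tabular Q function."""
    best_next = max(q[(s2, b)] for b in range(len(ACTIONS)))
    target = r if done else r + gamma * best_next
    q[(s, a)] += alpha * (target - q[(s, a)])


def train(episodes=30, planning_steps=20, eps=0.2, max_steps=200, seed=0):
    """Dyna-Q-style training: every real step is followed by
    `planning_steps` extra updates replayed from the learned model."""
    rng = random.Random(seed)
    q = defaultdict(float)
    model = {}          # (state, action) -> (reward, next_state, done)
    real_samples = 0    # number of interactions with the real environment
    for _ in range(episodes):
        s = START
        for _ in range(max_steps):
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                # Greedy action with random tie-breaking.
                a = max(range(len(ACTIONS)),
                        key=lambda b: (q[(s, b)], rng.random()))
            s2, r, done = step(s, ACTIONS[a])
            real_samples += 1
            q_update(q, s, a, r, s2, done)
            model[(s, a)] = (r, s2, done)
            # Model-aided part: replay simulated transitions.
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model))
                pr, ps2, pdone = model[(ps, pa)]
                q_update(q, ps, pa, pr, ps2, pdone)
            s = s2
            if done:
                break
    return q, real_samples


def greedy_path_length(q, max_steps=50):
    """Steps taken by the greedy policy from START, or None if GOAL is not reached."""
    s, n = START, 0
    while s != GOAL and n < max_steps:
        a = max(range(len(ACTIONS)), key=lambda b: q[(s, b)])
        s, _, _ = step(s, ACTIONS[a])
        n += 1
    return n if s == GOAL else None
```

Calling `train(planning_steps=0)` gives plain Q-learning on the same toy problem, so comparing `real_samples` and `greedy_path_length` between the two settings is one rough way to see the effect of model-generated experience. Any numbers obtained this way describe only this toy grid and say nothing about the results reported in the paper.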
Taxonomy
Topics: UAV Applications and Optimization · Energy Harvesting in Wireless Networks · Indoor and Outdoor Localization Technologies
Methods: Q-Learning
