Model-aided Deep Reinforcement Learning for Sample-efficient UAV Trajectory Design in IoT Networks

Omid Esrafilian, Harald Bayerlein, and David Gesbert

arXiv:2104.10403 · cs.IT · February 7, 2022

Open Access

TL;DR

This paper introduces a model-aided deep Q-learning method that significantly reduces training data requirements for UAV trajectory optimization in IoT networks, enabling efficient data collection without prior channel knowledge.

Contribution

It proposes a novel model-aided DRL approach using environment modeling and anchor nodes to cut down training data needs for UAV trajectory design.

Findings

01. Requires at least ten times fewer training samples than standard DRL methods.

02. Achieves comparable data collection performance with limited prior knowledge.

03. Demonstrates practical viability for real-world UAV IoT applications.

Abstract

Deep Reinforcement Learning (DRL) is gaining attention as a potential approach to design trajectories for autonomous unmanned aerial vehicles (UAVs) used as flying access points in the context of cellular or Internet of Things (IoT) connectivity. DRL solutions offer the advantage of on-the-go learning, hence relying on very little prior contextual information. A corresponding drawback, however, lies in the need for many learning episodes, which severely restricts the applicability of such an approach in real-world time- and energy-constrained missions. Here, we propose a model-aided deep Q-learning approach that, in contrast to previous work, considerably reduces the need for extensive training data samples, while still achieving the overarching goal of DRL, i.e., to guide a battery-limited UAV on an efficient data harvesting trajectory, without prior knowledge of wireless channel characteristics…
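The general idea of letting a learned model supply extra training experience, so that fewer real interactions are needed, can be illustrated with a small Dyna-Q style sketch. This is not the paper's method: the paper uses a deep Q-network and a model of the radio environment, whereas the code below assumes a hypothetical 1-D corridor, a tabular Q-function, and a lookup-table model. The function name `dyna_q` and all parameters are illustrative assumptions.

```python
import random


def dyna_q(n_states=6, episodes=30, planning_steps=10,
           alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Dyna-Q on a toy corridor (illustrative stand-in, not a UAV model).

    The agent starts in state 0 and receives reward 1 on reaching the last state.
    """
    rng = random.Random(seed)
    actions = (-1, +1)  # move left / move right
    goal = n_states - 1
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    model = {}      # learned model: (state, action) -> (reward, next_state)
    real_steps = 0  # number of interactions with the "real" environment

    def step(s, a):
        # The "real" environment: deterministic move, clipped at the walls.
        s2 = min(max(s + a, 0), goal)
        return (1.0 if s2 == goal else 0.0), s2

    def update(s, a, r, s2):
        # Standard Q-learning update; no bootstrapping from the terminal state.
        target = r if s2 == goal else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    def greedy(s):
        best = max(q[(s, b)] for b in actions)
        return rng.choice([b for b in actions if q[(s, b)] == best])

    for _ in range(episodes):
        s = 0
        while s != goal:
            a = rng.choice(actions) if rng.random() < eps else greedy(s)
            r, s2 = step(s, a)
            real_steps += 1
            update(s, a, r, s2)        # learn from the real sample
            model[(s, a)] = (r, s2)    # refine the model with the same sample
            for _ in range(planning_steps):
                # Replay transitions simulated by the learned model.
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2
    return q, real_steps


if __name__ == "__main__":
    q, real_steps = dyna_q()
    policy = [max((-1, +1), key=lambda a: q[(s, a)]) for s in range(5)]
    print("greedy actions per state:", policy)
    print("real environment steps used:", real_steps)
```

Running the function with `planning_steps=0` and comparing `real_steps` against the default gives a rough feel for how simulated updates can substitute for real samples in this toy setting; it says nothing about the sample savings reported in the paper.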

Taxonomy

Topics: UAV Applications and Optimization · Energy Harvesting in Wireless Networks · Indoor and Outdoor Localization Technologies

Methods: Q-Learning