Solving reward-collecting problems with UAVs: a comparison of online optimization and Q-learning

Yixuan Liu; Chrysafis Vogiatzis; Ruriko Yoshida; Erich Morman

arXiv:2112.00141·cs.LG·December 2, 2021

TL;DR

This paper compares online optimization and Q-learning methods for UAV path planning in reward-collecting scenarios with adversaries, demonstrating their effectiveness and efficiency in grid-world simulations.

Contribution

The paper introduces a framework for UAV path optimization in the presence of adversaries and compares three solution methods: Deep Q-Learning, tabular Q-learning, and online optimization.

Findings

1. Deep Q-Learning performs well in complex environments.
2. Online optimization offers faster computation times.
3. Tabular Q-Learning is effective in simpler scenarios.

Abstract

Uncrewed autonomous vehicles (UAVs) have made significant contributions to reconnaissance and surveillance missions in past US military campaigns. As the prevalence of UAVs increases, there have also been improvements in counter-UAV technology that make it difficult for them to successfully obtain valuable intelligence within an area of interest. Hence, it has become important that modern UAVs can accomplish their missions while maximizing their chances of survival. In this work, we specifically study the problem of identifying a short path from a designated start to a goal, while collecting all rewards and avoiding adversaries that move randomly on the grid. We also provide a possible application of the framework in a military setting, that of autonomous casualty evacuation. We present a comparison of three methods to solve this problem: namely, we implement a Deep Q-Learning model, an…
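To make the problem statement concrete, the following is a minimal sketch of tabular Q-learning on a toy grid with collectible rewards and one randomly moving adversary. It is not the authors' implementation: the grid size, cell positions, reward values, hyperparameters, and the helper names `move`, `step`, and `train` are all assumptions chosen for illustration. As a further simplification, the learner's state is only the agent position plus a bitmask of collected rewards, so the adversary's position is not observed.

```python
import random

# Hypothetical toy setup; none of these values come from the paper.
SIZE = 4
START, GOAL = (0, 0), (3, 3)
REWARDS = [(0, 3), (2, 1)]                    # cells holding a collectible reward
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def move(pos, action):
    """Apply an action, clamping the result to the grid boundary."""
    r = min(max(pos[0] + action[0], 0), SIZE - 1)
    c = min(max(pos[1] + action[1], 0), SIZE - 1)
    return (r, c)


def step(pos, mask, adversary, action, rng):
    """One environment step: returns (pos, mask, adversary, reward, done)."""
    pos = move(pos, action)
    adversary = move(adversary, rng.choice(ACTIONS))  # adversary moves at random
    reward = -1.0                                     # per-step cost favours short paths
    if pos == adversary:
        return pos, mask, adversary, reward - 50.0, True  # caught: episode ends
    for i, cell in enumerate(REWARDS):
        if pos == cell and not mask & (1 << i):
            mask |= 1 << i                            # mark reward i as collected
            reward += 10.0
    all_collected = mask == (1 << len(REWARDS)) - 1
    if pos == GOAL and all_collected:
        return pos, mask, adversary, reward + 50.0, True
    return pos, mask, adversary, reward, False


def train(episodes=3000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning over (position, collected-mask) states."""
    rng = random.Random(seed)
    q = {}  # maps (pos, mask) -> list of action values
    for _ in range(episodes):
        pos, mask, adversary = START, 0, (3, 0)
        for _ in range(100):  # cap on episode length
            values = q.setdefault((pos, mask), [0.0] * len(ACTIONS))
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = values.index(max(values))
            pos, mask, adversary, reward, done = step(
                pos, mask, adversary, ACTIONS[a], rng
            )
            next_values = q.setdefault((pos, mask), [0.0] * len(ACTIONS))
            # Standard Q-learning target: r + gamma * max_a' Q(s', a')
            target = reward if done else reward + gamma * max(next_values)
            values[a] += alpha * (target - values[a])
            if done:
                break
    return q
```

Calling `train()` returns the learned table; a greedy path can then be read off by repeatedly taking the highest-valued action from `((0, 0), 0)`. Including the collected-reward bitmask in the state is what lets a single table represent "collect everything, then go to the goal"; a fuller treatment would also encode the adversary's position.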

Code & Models

Repositories

benliu31492/solving-reward-collecting-problems-with-uavs-a-comparison-of-online-optimization-and-q-learning (tf, Official)

Taxonomy

Topics: Optimization and Search Problems · Reinforcement Learning in Robotics · Robotic Path Planning Algorithms

Methods: Q-Learning