Solving reward-collecting problems with UAVs: a comparison of online optimization and Q-learning
Yixuan Liu, Chrysafis Vogiatzis, Ruriko Yoshida, Erich Morman

TL;DR
This paper compares online optimization and Q-learning methods for UAV path planning in reward-collecting scenarios with adversaries, demonstrating their effectiveness and efficiency in grid-world simulations.
Contribution
It introduces a framework for UAV path optimization in the presence of adversaries and compares three solution methods: Deep Q-Learning, tabular Q-learning, and online optimization.
Findings
Deep Q-Learning performs well in complex environments.
Online optimization offers faster computation times.
Tabular Q-Learning is effective in simpler scenarios.
Abstract
Uncrewed autonomous vehicles (UAVs) have made significant contributions to reconnaissance and surveillance missions in past US military campaigns. As the prevalence of UAVs increases, there have also been improvements in counter-UAV technology that make it difficult for them to successfully obtain valuable intelligence within an area of interest. Hence, it has become important that modern UAVs can accomplish their missions while maximizing their chances of survival. In this work, we specifically study the problem of identifying a short path from a designated start to a goal, while collecting all rewards and avoiding adversaries that move randomly on the grid. We also provide a possible application of the framework in a military setting, that of autonomous casualty evacuation. We present a comparison of three methods to solve this problem: namely we implement a Deep Q-Learning model, an…
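To make the problem setting concrete, here is a minimal sketch of tabular Q-learning on a small grid with collectible rewards and one randomly moving adversary. It is not the authors' implementation. The grid size, cell positions, reward values, hyperparameters, and the helper names `move`, `step`, `q_update`, and `train` are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# All constants below are illustrative assumptions, not values from the paper.
SIZE = 4                                  # grid is SIZE x SIZE
START, GOAL = (0, 0), (3, 3)
ADVERSARY_START = (3, 0)
REWARDS = ((0, 3), (2, 1))                # cells holding collectible rewards
ACTIONS = ((0, 1), (0, -1), (1, 0), (-1, 0))


def move(pos, delta):
    """Apply a move, clamping the result to the grid boundary."""
    r = min(max(pos[0] + delta[0], 0), SIZE - 1)
    c = min(max(pos[1] + delta[1], 0), SIZE - 1)
    return (r, c)


def step(state, action, rng):
    """Advance one time step and return (next_state, reward, done)."""
    agent, adversary, collected = state
    agent = move(agent, ACTIONS[action])
    adversary = move(adversary, rng.choice(ACTIONS))  # adversary moves randomly
    reward = -1.0                                     # step cost favors short paths
    if agent == adversary:                            # caught: episode ends
        return (agent, adversary, collected), reward - 50.0, True
    for i, cell in enumerate(REWARDS):
        if agent == cell and not collected & (1 << i):
            collected |= 1 << i                       # mark reward i as collected
            reward += 10.0
    all_collected = collected == (1 << len(REWARDS)) - 1
    if agent == GOAL and all_collected:               # success needs every reward
        return (agent, adversary, collected), reward + 50.0, True
    return (agent, adversary, collected), reward, False


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.95):
    """Standard tabular Q-learning update toward the one-step target."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def train(episodes=2000, epsilon=0.1, max_steps=100, seed=0):
    """Run epsilon-greedy Q-learning and return the learned Q-table."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(ACTIONS))
    for _ in range(episodes):
        state = (START, ADVERSARY_START, 0)
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action, rng)
            q_update(q, state, action, reward, next_state, done)
            state = next_state
            if done:
                break
    return q
```

The state in this sketch combines the agent position, the adversary position, and a bitmask of collected rewards, so the table grows quickly with grid size and reward count. That growth is one plausible reason a tabular method would suit only simpler scenarios.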
Taxonomy
Topics: Optimization and Search Problems · Reinforcement Learning in Robotics · Robotic Path Planning Algorithms
Methods: Q-Learning
