A Scalable Decentralized Reinforcement Learning Framework for UAV Target Localization Using Recurrent PPO
Leon Fernando, Billy Pik Lik Lau, Chau Yuen, U-Xuan Tan

TL;DR
This paper presents a scalable decentralized reinforcement learning framework using Recurrent PPO for UAV target localization in GNSS/GPS-denied environments, reporting 93% accuracy with a single drone and 86% with two drones, the latter needing fewer average steps to locate the target.
Contribution
The study introduces a novel decentralized Recurrent PPO approach for UAV target localization that works in environments without GNSS/GPS signals. Its two-drone model locates the target in fewer average steps than the single-drone model.
Findings
Single-drone accuracy of 93%
Two-drone accuracy of 86% with fewer steps
Effective in GPS-denied environments
Abstract
Rapid advances in unmanned aerial vehicles (UAVs) have unlocked numerous applications, including environmental monitoring, disaster response, and agricultural surveying. Enhancing the collective behavior of multiple decentralized UAVs can significantly improve these applications through more efficient and coordinated operations. In this study, we explore a Recurrent PPO model for target localization in perceptually degraded environments, such as those without GNSS/GPS signals. We first developed a single-drone approach for target identification, followed by a decentralized two-drone model. Our approach can utilize two types of sensors on the UAVs: a detection sensor and a target signal sensor. The single-drone model achieved an accuracy of 93%, while the two-drone model achieved an accuracy of 86%, with the latter requiring fewer average steps to locate the target. This…
Taxonomy
Topics: Robotics and Sensor-Based Localization · Distributed Control Multi-Agent Systems · Target Tracking and Data Fusion in Sensor Networks
Methods: Entropy Regularization · Proximal Policy Optimization
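
For readers unfamiliar with the two methods listed above, the following is a minimal sketch of the standard PPO clipped surrogate objective with an entropy bonus. It is not the authors' implementation: the function name `ppo_clip_loss` and the defaults `clip_eps=0.2` and `ent_coef=0.01` are illustrative assumptions, and the recurrent part of Recurrent PPO (carrying a hidden state across time steps) is omitted entirely.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, entropies,
                  clip_eps=0.2, ent_coef=0.01):
    """Return a scalar PPO policy loss to minimize.

    All arguments are equal-length lists of per-sample floats.
    clip_eps and ent_coef are assumed values, not taken from the paper.
    """
    n = len(advantages)
    surrogate = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the updated and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (minimum) choice between unclipped and clipped terms.
        surrogate += min(ratio * adv, clipped * adv)
    surrogate /= n
    mean_entropy = sum(entropies) / n
    # Maximize surrogate and entropy, so negate both to obtain a loss.
    return -surrogate - ent_coef * mean_entropy
```

The clipping limits how far a single update can move the policy away from the one that collected the data, while the entropy term rewards keeping the action distribution spread out, which encourages exploration.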
