Deep Reinforcement Learning-based UAV Navigation and Control: A Soft Actor-Critic with Hindsight Experience Replay Approach
Myoung Hoon Lee, Jun Moon

TL;DR
This paper introduces SACHER, a novel deep reinforcement learning algorithm combining SAC and HER, which enhances UAV navigation and control by achieving faster, more accurate learning of optimal paths amidst obstacles.
Contribution
The paper proposes SACHER, integrating HER with SAC to improve sample efficiency and learning performance in UAV navigation and control tasks.
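To make the HER side of this combination concrete, the following is a minimal sketch of hindsight relabeling, not the authors' implementation. It assumes the simple "final" relabeling strategy, a sparse reward with a distance tolerance, and dictionary-based transitions; the names `sparse_reward`, `her_relabel_final`, and `tol` are hypothetical and chosen for illustration only.

```python
import math

def sparse_reward(achieved_goal, goal, tol=0.05):
    """Hypothetical sparse reward: 0.0 within `tol` of the goal, otherwise -1.0."""
    return 0.0 if math.dist(achieved_goal, goal) <= tol else -1.0

def her_relabel_final(episode, tol=0.05):
    """Return the original transitions plus copies relabeled in hindsight.

    Each relabeled copy pretends that the position reached at the end of the
    episode was the goal all along, so a failed episode still yields at least
    one transition with a success reward.
    """
    final_goal = episode[-1]["achieved_goal"]
    out = []
    for transition in episode:
        out.append(dict(transition))  # keep the original transition
        relabeled = dict(transition)
        relabeled["goal"] = final_goal
        relabeled["reward"] = sparse_reward(transition["achieved_goal"], final_goal, tol)
        out.append(relabeled)
    return out

# Toy episode: the agent never reaches the intended goal (1.0, 1.0).
goal = (1.0, 1.0)
positions = [(0.1, 0.0), (0.2, 0.1), (0.3, 0.2)]
episode = [
    {"achieved_goal": p, "goal": goal, "reward": sparse_reward(p, goal)}
    for p in positions
]
replay = her_relabel_final(episode)
```

In this toy episode every original transition carries a reward of -1.0, while the relabeled copy of the last transition carries 0.0. An off-policy learner such as SAC can sample both kinds of transition from the same replay buffer, which is the mechanism by which HER is said to improve sample efficiency.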
Findings
SACHER outperforms SAC and DDPG in UAV navigation tasks.
SACHER achieves lower tracking error and higher cumulative reward.
The method is applicable to various UAV models.
Abstract
In this paper, we propose SACHER (soft actor-critic (SAC) with hindsight experience replay (HER)), which constitutes a class of deep reinforcement learning (DRL) algorithms. SAC is known as an off-policy model-free DRL algorithm based on the maximum entropy framework, which outperforms earlier DRL algorithms in terms of exploration, robustness and learning performance. However, in SAC, maximizing the entropy-augmented objective may degrade the optimality of learning outcomes. HER is known as a sample-efficient replay method that enhances the performance of off-policy DRL algorithms by allowing the agent to learn from both failures and successes. We apply HER to SAC and propose SACHER to improve the learning performance of SAC. More precisely, SACHER achieves the desired optimal outcomes faster and more accurately than SAC, since HER improves the sample efficiency of SAC. We apply SACHER…
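For reference, the entropy-augmented objective mentioned in the abstract is usually written as follows in the SAC literature. This is the standard form, given here as an assumption about notation; the paper's own symbols may differ. Here $\alpha > 0$ is the temperature that weights the entropy term and $\rho_\pi$ is the state-action distribution induced by the policy $\pi$.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big)
  = - \mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \left[ \log \pi(a \mid s_t) \right]
```

With $\alpha = 0$ this reduces to the ordinary expected-return objective. For $\alpha > 0$ the maximizer trades some return for policy entropy, which is the sense in which the abstract says the entropy-augmented objective may degrade the optimality of the learned outcome.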
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Robotic Path Planning Algorithms
Methods: Adam · Batch Normalization · Convolution · Average Pooling · Global Average Pooling · Dense Connections · 1x1 Convolution · Dilated Convolution · Weight Decay
