Asynchronous Curriculum Experience Replay: A Deep Reinforcement Learning Approach for UAV Autonomous Motion Control in Unknown Dynamic Environments

Zijian Hu; Xiaoguang Gao; Kaifang Wan; Qianglong Wang; Yiwei Zhai

arXiv:2207.01251·cs.AI·July 5, 2022·1 cites

TL;DR

This paper introduces asynchronous curriculum experience replay (ACER), a deep reinforcement learning method for UAV autonomous motion control in complex, dynamic 3D environments that improves learning efficiency and robustness.

Contribution

It proposes asynchronous curriculum experience replay (ACER), which improves experience sampling and prioritization and integrates curriculum learning for better UAV control in unknown environments.

Findings

1. ACER converges 24.66% faster than TD3.
2. ACER reaches a 5.59% better converged result than TD3.
3. ACER demonstrates strong robustness and generalization in various environments.

Abstract

Unmanned aerial vehicles (UAVs) have been widely used in military warfare. In this paper, we formulate the autonomous motion control (AMC) problem as a Markov decision process (MDP) and propose an advanced deep reinforcement learning (DRL) method that allows UAVs to execute complex tasks in large-scale dynamic three-dimensional (3D) environments. To overcome the limitations of the prioritized experience replay (PER) algorithm and improve performance, the proposed asynchronous curriculum experience replay (ACER) uses multithreads to asynchronously update the priorities, assigns the true priorities and applies a temporary experience pool to make available experiences of higher quality for learning. A first-in-useless-out (FIUO) experience pool is also introduced to ensure the higher use value of the stored experiences. In addition, combined with curriculum learning (CL), a more reasonable…
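To make the replay vocabulary in the abstract concrete, here is a minimal sketch of a standard proportional prioritized experience pool, the PER baseline the abstract says ACER builds on. It is not the authors' implementation. The class name `PrioritizedPool`, the `alpha` and `eps` values, and the eviction rule are all assumptions for illustration: the abstract does not define how the FIUO pool picks the "useless" experience, so the sketch simply evicts the lowest-priority entry as one possible reading. The asynchronous, multithreaded priority updates and the temporary pool are left out.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PrioritizedPool:
    """Toy proportional prioritized replay pool (generic PER idea, not ACER itself)."""
    capacity: int
    alpha: float = 0.6  # assumed value: how strongly priorities skew sampling
    items: list = field(default_factory=list)       # stored experiences
    priorities: list = field(default_factory=list)  # one priority per experience

    def add(self, experience, priority=1.0):
        if len(self.items) >= self.capacity:
            # ASSUMPTION: evict the lowest-priority entry instead of the oldest.
            # This is only one possible reading of "first-in-useless-out".
            idx = min(range(len(self.priorities)), key=self.priorities.__getitem__)
            self.items.pop(idx)
            self.priorities.pop(idx)
        self.items.append(experience)
        self.priorities.append(float(priority))

    def sample(self, batch_size, rng=random):
        # Sample with replacement, with probability proportional to priority**alpha.
        weights = [p ** self.alpha for p in self.priorities]
        idxs = rng.choices(range(len(self.items)), weights=weights, k=batch_size)
        return idxs, [self.items[i] for i in idxs]

    def update_priorities(self, idxs, td_errors, eps=1e-3):
        # Refresh priorities from new TD errors; eps keeps every priority positive.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + eps
```

In this sketch the learner would call `sample`, compute TD errors for the batch, and pass them to `update_priorities`. In the method described by the abstract, that priority refresh is performed asynchronously by separate threads rather than inline.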

Taxonomy

Topics: Video Surveillance and Tracking Methods · Reinforcement Learning in Robotics · Human Pose and Action Recognition

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Dense Connections · Retrace · Trust Region Policy Optimization · Stochastic Dueling Network · Convolution · Softmax · Entropy Regularization · ACER