Maneuver Decision-Making For Autonomous Air Combat Through Curriculum Learning And Reinforcement Learning With Sparse Rewards
Yu-Jie Wei, Hong-Peng Zhang, Chang-Qiang Huang

TL;DR
This paper introduces a curriculum learning approach combined with reinforcement learning to improve autonomous air combat maneuver decision-making, addressing training efficiency and performance issues with sparse rewards.
Contribution
It designs three curricula for air combat decision-making and demonstrates their effects on training speed, stability, and agent performance, highlighting the benefits of angle and distance curricula.
Findings
Angle curriculum improves training speed and stability.
Distance curriculum enhances training efficiency.
Hybrid curriculum causes agents to get stuck at local optima.
Abstract
Reinforcement learning is an effective way to solve decision-making problems, and applying it to autonomous air combat maneuver decision-making is a meaningful and valuable research direction. However, when reinforcement learning is applied to decision-making problems with sparse rewards, such as air combat maneuver decision-making, training takes too long and the performance of the trained agent may be unsatisfactory. To address these problems, a method based on curriculum learning is proposed. First, three curricula for air combat maneuver decision-making are designed: an angle curriculum, a distance curriculum, and a hybrid curriculum. Each curriculum is used to train air combat agents separately, and the results are compared with the original method without any curriculum. The training results show that the angle curriculum can increase the speed and…
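The summary above does not say how the curricula are defined, so the following is only a minimal sketch of what an angle curriculum could look like, not the authors' method. It assumes that "angle" means the initial angle between the agent's heading and the line of sight to the target, that smaller angles are easier, and that difficulty is raised in discrete stages. The function names, the linear schedule, and the 180-degree limit are all hypothetical.

```python
import random


def angle_curriculum_range(stage, num_stages, max_angle_deg=180.0):
    """Return the (low, high) range of initial angles for one curriculum stage.

    Hypothetical schedule: stage 0 allows only small initial angles (assumed
    to be the easy case), and the upper bound grows linearly until the last
    stage covers the full range up to max_angle_deg.
    """
    if not 0 <= stage < num_stages:
        raise ValueError("stage must be in [0, num_stages)")
    high = max_angle_deg * (stage + 1) / num_stages
    return 0.0, high


def sample_initial_angle(stage, num_stages, rng=random):
    """Sample an initial angle (degrees) for an episode at the given stage."""
    low, high = angle_curriculum_range(stage, num_stages)
    return rng.uniform(low, high)
```

In a training loop, an episode at an early stage would be initialized with `sample_initial_angle(0, 4)`, and the stage index would be advanced once the agent meets some success criterion. A distance curriculum could follow the same pattern with initial range in place of angle. Because the terminal reward stays sparse, only the initial conditions change between stages.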
Taxonomy
Topics: Guidance and Control Systems · Aerospace and Aviation Technology · Military Defense Systems Analysis
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
