Homotopy Based Reinforcement Learning with Maximum Entropy for Autonomous Air Combat

Yiwen Zhu; Zhou Fang; Yuan Zheng; Wenya Wei

arXiv:2112.01328·cs.LG·December 3, 2021

TL;DR

This paper introduces a homotopy-based reinforcement learning method with maximum entropy for autonomous air combat, improving decision speed and convergence in highly dynamic scenarios by bridging the original sparse-reward task and an auxiliary task with an artificial prior-experience reward.

Contribution

The paper proposes a homotopy-based soft actor-critic method (HSAC) that improves RL convergence and performance in complex air combat tasks by following the homotopy path between the sparse-reward task and the artificial-reward task.

Findings

1. Achieved a win rate of over 98.3% in attack tasks
2. Improved convergence speed over traditional RL methods
3. Demonstrated effectiveness in 3D air combat simulation environments

Abstract

Intelligent decision-making for unmanned combat aerial vehicles (UCAVs) has long been a challenging problem. Conventional search methods can hardly satisfy the real-time demands of highly dynamic air combat scenarios. Reinforcement learning (RL) methods can significantly shorten decision time by using neural networks. However, the sparse reward problem limits their convergence speed, and a reward built from artificial prior experience can easily pull convergence away from the optimum of the original task, which makes RL difficult to apply to air combat. In this paper, we propose a homotopy-based soft actor-critic method (HSAC), which addresses these problems by following the homotopy path between the original task with sparse reward and the auxiliary task with artificial prior experience reward. The convergence and the feasibility of this method are also…
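The idea of a homotopy path between the two tasks can be illustrated with a minimal sketch. The abstract does not give the form of the homotopy, so the linear blend below, the parameter name `lam`, and the helper names `homotopy_reward` and `lambda_schedule` are all assumptions made for illustration, not the authors' formulation.

```python
def homotopy_reward(r_sparse: float, r_aux: float, lam: float) -> float:
    """Blend two reward signals along an assumed linear homotopy.

    lam = 0.0 gives the auxiliary task (artificial prior-experience reward);
    lam = 1.0 gives the original task (sparse reward).
    The linear form is a hypothetical choice, not taken from the paper.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * r_aux + lam * r_sparse


def lambda_schedule(step: int, total_steps: int) -> float:
    """Hypothetical schedule: move lam linearly from 0 to 1 over training."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return min(1.0, max(0.0, step / total_steps))


if __name__ == "__main__":
    # Early in training the shaped reward dominates; late in training
    # the agent is optimizing the original sparse reward only.
    for step in (0, 50, 100):
        lam = lambda_schedule(step, total_steps=100)
        print(step, lam, homotopy_reward(r_sparse=1.0, r_aux=0.2, lam=lam))
```

In a training loop, a reward of this kind would replace the environment reward passed to the soft actor-critic update, so that the policy learned on the easier auxiliary task is carried gradually toward the original task.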

Taxonomy

Topics: Guidance and Control Systems · Artificial Intelligence in Games · Reinforcement Learning in Robotics

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings