# Efficient Detection of Malicious Traffic Using a Decision Tree-Based Proximal Policy Optimisation Algorithm: A Deep Reinforcement Learning Malicious Traffic Detection Model Incorporating Entropy

**Authors:** Yuntao Zhao, Deao Ma, Wei Liu

PMC · DOI: 10.3390/e26080648 · Entropy · 2024-07-30

## TL;DR

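This paper detects malicious network traffic in two stages: an entropy-based decision tree scores features and discards low-contribution ones, then a PPO deep reinforcement learning agent classifies the traffic. The model reaches 99.17% binary classification accuracy on the CIC-IDS2017 dataset.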

## Contribution

The paper proposes a malicious traffic detection model that combines entropy-based decision tree feature selection with a proximal policy optimisation (PPO) deep reinforcement learning detector.

## Key findings

- The model achieves 99.17% binary classification accuracy on the CIC-IDS2017 dataset.
- Entropy-based feature selection improves model efficiency by removing low-contribution features.
- Adding an entropy regularisation term to the PPO update is reported to improve detection performance.
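
The entropy-based feature selection step can be illustrated with a minimal sketch. This summary does not give the authors' importance formula or cut-off, so the code below assumes plain information gain (the entropy reduction used by decision tree splits) over discrete features. The function names, the `threshold` parameter, and the toy data are hypothetical and are not taken from the paper or from CIC-IDS2017.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting `labels` on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the label subsets after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(columns, labels, threshold):
    """Return the feature names whose information gain is at least `threshold`.

    `threshold` is an assumed cut-off, not a value from the paper.
    """
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return kept, scores


# Toy data (hypothetical): one informative feature, one constant feature.
labels = ["benign", "benign", "attack", "attack"]
columns = {
    "flag_syn": [0, 0, 1, 1],  # separates the two classes perfectly
    "const": [1, 1, 1, 1],     # carries no information about the label
}
kept, scores = select_features(columns, labels, threshold=0.1)
```

In this toy case the constant feature scores zero and is dropped, which mirrors the idea of removing features that contribute little to classification. Continuous flow features would first need binning or a threshold search, which this sketch leaves out.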

## Abstract

With the popularity of the Internet and the increase in the level of information technology, cyber attacks have become an increasingly serious problem. They pose a great threat to the security of individuals, enterprises, and the state. This has made network intrusion detection technology critically important. In this paper, a malicious traffic detection model is constructed based on a decision tree classifier of entropy and a proximal policy optimisation algorithm (PPO) of deep reinforcement learning. Firstly, the decision tree idea in machine learning is used to make a preliminary classification judgement on the dataset based on the information entropy. The importance score of each feature in the classification work is calculated and the features with lower contributions are removed. Then, it is handed over to the PPO algorithm model for detection. An entropy regularity term is introduced in the process of the PPO algorithm update. Finally, the deep reinforcement learning algorithm is used to continuously train and update the parameters during the detection process, and finally, the detection model with higher accuracy is obtained. Experiments show that the binary classification accuracy of the malicious traffic detection model based on the deep reinforcement learning PPO algorithm can reach 99.17% under the CIC-IDS2017 dataset used in this paper.
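
To make the entropy term in the PPO update concrete, the sketch below computes the commonly used PPO clipped surrogate loss with an entropy bonus for a discrete action space (for example, benign versus malicious). It assumes the standard formulation; the exact form of the term, the clipping range `clip_eps`, and the weight `entropy_coef` used by the authors are not stated in this summary, so the names and default values here are illustrative assumptions only.

```python
import math


def policy_entropy(probs):
    """Entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def ppo_loss(new_dists, old_action_probs, actions, advantages,
             clip_eps=0.2, entropy_coef=0.01):
    """Clipped PPO loss with an entropy bonus, averaged over a batch.

    new_dists        -- per-sample action distributions under the current policy
    old_action_probs -- probability the old policy gave to the action taken
    actions          -- index of the action taken in each sample
    advantages       -- advantage estimate for each sample
    clip_eps, entropy_coef -- assumed hyperparameters, not values from the paper
    """
    surrogate = 0.0
    entropy_sum = 0.0
    for dist, old_p, action, adv in zip(new_dists, old_action_probs,
                                        actions, advantages):
        ratio = dist[action] / old_p
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the raw and clipped terms.
        surrogate += min(ratio * adv, clipped * adv)
        entropy_sum += policy_entropy(dist)
    n = len(actions)
    # Negated because optimisers minimise; higher entropy lowers the loss.
    return -(surrogate / n + entropy_coef * entropy_sum / n)


# Hypothetical single-sample batch: uniform policy, unchanged ratio.
loss = ppo_loss([[0.5, 0.5]], [0.5], [0], [1.0])
```

The entropy bonus rewards policies that keep some probability on each action, which discourages the agent from collapsing early onto a single class. A full implementation would compute this loss on network outputs in an automatic differentiation framework and add a value-function term.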

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11353857/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11353857/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC11353857/full.md

---
Source: https://tomesphere.com/paper/PMC11353857