# Building an Efficient Intrusion Detection System Based on Feature Selection and Ensemble Classifier

**Authors:** Yuyang Zhou, Guang Cheng, Shanqing Jiang, Mian Dai

arXiv: 1904.01352 · 2020-04-03

## TL;DR

This paper presents a new intrusion detection system that combines feature selection with ensemble learning to improve detection accuracy and adaptability against various network attacks.

## Contribution

The paper introduces CFS-BA, a heuristic feature selection algorithm, and an ensemble classifier that combines C4.5, Random Forest (RF), and Forest by Penalizing Attributes (Forest PA) for intrusion detection.
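As a rough illustration of what correlation-based selection optimizes, the sketch below computes the standard CFS merit score for a candidate feature subset: it rewards features that correlate with the class and penalizes features that correlate with each other. This summary does not state the exact objective used by CFS-BA, so the formula (the classic CFS merit) and the function name `cfs_merit` are assumptions for illustration. The heuristic search over subsets is not sketched.

```python
import math
from typing import Sequence


def cfs_merit(feature_class_corr: Sequence[float],
              feature_feature_corr: Sequence[float]) -> float:
    """Classic CFS merit of a feature subset (assumed, not taken from the paper).

    feature_class_corr: one correlation per selected feature with the class.
    feature_feature_corr: one correlation per unordered pair of selected features.
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(r) for r in feature_class_corr) / k
    # Mean absolute feature-feature correlation (0 when only one feature).
    pairs = list(feature_feature_corr)
    r_ff = sum(abs(r) for r in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Hypothetical correlations for a two-feature subset.
merit_pair = cfs_merit([0.8, 0.6], [0.5])
# A single feature has no redundancy term, so its merit equals its class correlation.
merit_single = cfs_merit([0.8], [])
```

A search procedure would evaluate many subsets with a score like this and keep the highest-scoring one.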

## Key findings

- Outperforms related and state-of-the-art methods on the NSL-KDD, AWID, and CIC-IDS2017 datasets under several metrics.
- Reduces feature dimensionality effectively, improving detection speed.
- Achieves higher accuracy and robustness in attack recognition.

## Abstract

An intrusion detection system (IDS) is one of the most widely used techniques in a network topology for safeguarding the integrity and availability of sensitive assets in protected systems. Although many supervised and unsupervised machine learning approaches have been used to increase the efficacy of IDSs, existing intrusion detection algorithms still struggle to achieve good performance. First, the large amount of redundant and irrelevant data in high-dimensional datasets interferes with the classification process of an IDS. Second, an individual classifier may not perform well in detecting every type of attack. Third, many models are built for stale datasets, making them less adaptable to novel attacks. We therefore propose a new intrusion detection framework based on feature selection and ensemble learning techniques. In the first step, a heuristic algorithm called CFS-BA is proposed for dimensionality reduction, which selects the optimal subset based on the correlation between features. Then, we introduce an ensemble approach that combines the C4.5, Random Forest (RF), and Forest by Penalizing Attributes (Forest PA) algorithms. Finally, a voting technique is used to combine the probability distributions of the base learners for attack recognition. Experimental results on the NSL-KDD, AWID, and CIC-IDS2017 datasets show that the proposed CFS-BA-Ensemble method performs better than related and state-of-the-art approaches under several metrics.
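The voting step described in the abstract can be sketched as follows. The abstract says only that the base learners' probability distributions are combined by voting, so this minimal example assumes an average-of-probabilities rule. The function names and the per-class probabilities are hypothetical and are not results from the paper.

```python
from typing import Dict, List


def average_of_probabilities(distributions: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-class probabilities across base learners (assumed rule)."""
    if not distributions:
        raise ValueError("need at least one distribution")
    # Union of all class labels seen by any learner.
    classes = sorted(set().union(*distributions))
    n = len(distributions)
    # A learner that omits a class contributes probability 0 for it.
    return {c: sum(d.get(c, 0.0) for d in distributions) / n for c in classes}


def vote(distributions: List[Dict[str, float]]) -> str:
    """Return the class with the highest averaged probability."""
    averaged = average_of_probabilities(distributions)
    return max(averaged, key=averaged.get)


# Hypothetical outputs of the three base learners for one network record.
c45 = {"normal": 0.60, "dos": 0.40}
rf = {"normal": 0.30, "dos": 0.70}
forest_pa = {"normal": 0.45, "dos": 0.55}

averaged = average_of_probabilities([c45, rf, forest_pa])
predicted = vote([c45, rf, forest_pa])
```

In this made-up case C4.5 alone would label the record as normal, while the averaged distribution favors the attack class. This is the kind of disagreement an ensemble is meant to resolve.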

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.01352/full.md

## Figures

32 figures with captions in the complete paper: https://tomesphere.com/paper/1904.01352/full.md

## References

99 references — full list in the complete paper: https://tomesphere.com/paper/1904.01352/full.md

---
Source: https://tomesphere.com/paper/1904.01352