# Intelligent model for the detection and classification of encrypted network traffic in cloud infrastructure

**Authors:** Muhammad Dawood, Chuangbai Xiao, Shanshan Tu, Faiz Abdullah Alotaibi, Mrim M. Alnfiai, Muhammad Farhan

PMC · DOI: 10.7717/peerj-cs.2027 · 2024-05-27

## TL;DR

The paper presents a machine learning approach to detect and classify encrypted DNS traffic, aiming to improve cybersecurity in cloud environments.

## Contribution

The study introduces a novel ML-based method for detecting and classifying encrypted DNS traffic, and evaluates how well several classifiers (AdaBoost, SVC-RBF, and QDA) perform on that task.

## Key findings

- The AdaBoost model achieved 75% accuracy for malicious traffic and 73% for DoH traffic.
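- The quadratic discriminant analysis (QDA) model reached 99% accuracy for malicious traffic and 98% for non-DoH traffic.
- The SVC-RBF model (support vector classification with a radial basis function kernel) reached 76% accuracy in distinguishing malicious from non-DoH traffic.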
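
The findings above quote a separate accuracy figure for each traffic class, but this summary does not say how those figures were computed. One common reading is per-class accuracy (equivalently, per-class recall): of all samples that truly belong to a class, the fraction the model labels correctly. The sketch below illustrates that reading only. The function names and the toy labels are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Map each true label to the fraction of its samples predicted correctly."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}

def overall_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly, regardless of class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels, invented for illustration (not data from the paper).
y_true = ["malicious"] * 4 + ["non-doh"] * 4
y_pred = ["malicious", "malicious", "malicious", "non-doh",
          "non-doh", "non-doh", "non-doh", "non-doh"]

scores = per_class_accuracy(y_true, y_pred)
overall = overall_accuracy(y_true, y_pred)
print(scores, overall)
```

Reporting both numbers matters when classes are imbalanced: a model can score well overall while performing poorly on the rarer class.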

## Abstract

This article explores detecting and categorizing network traffic data using machine-learning (ML) methods, specifically focusing on the Domain Name System (DNS) protocol. DNS has long been susceptible to various security flaws that have been frequently exploited over time, making DNS abuse a major concern in cybersecurity. Because attackers employ advanced tactics to steal data in real time, ensuring security and privacy for DNS queries and answers remains challenging. The evolving landscape of internet services has allowed attackers to launch cyber-attacks on computer networks. However, implementing Secure Socket Layer (SSL)-encrypted Hypertext Transfer Protocol (HTTP) transmission, known as HTTPS, has significantly reduced DNS-based assaults. To further enhance security and mitigate threats like man-in-the-middle attacks, the security community has developed the concept of DNS over HTTPS (DoH). DoH aims to combat the eavesdropping and tampering of DNS data during communication. This study employs an ML-based classification approach on a dataset for traffic analysis. The AdaBoost model effectively classified malicious and non-DoH traffic, with accuracies of 75% and 73% for DoH traffic. The support vector classification model with a radial basis function (SVC-RBF) achieved 76% accuracy in classifying between malicious and non-DoH traffic. The quadratic discriminant analysis (QDA) model achieved 99% accuracy in classifying malicious traffic and 98% in classifying non-DoH traffic.
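
To make the DoH mechanism mentioned in the abstract concrete, the sketch below builds the URL for a DoH GET request as specified in RFC 8484: an ordinary DNS query in wire format is base64url-encoded (without padding) and passed as the `dns` parameter of an HTTPS request. This is only an illustration of the protocol, not code from the paper. The resolver URL `https://dns.example/dns-query` and the helper names are hypothetical, and no network request is made.

```python
import base64
import struct

def build_dns_query(name, qtype=1):
    """Encode a single-question DNS query in wire format (RFC 1035)."""
    # Header: ID=0 (suggested by RFC 8484 for cache friendliness),
    # flags=0x0100 (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A record) followed by QCLASS (1 = IN).
    question = qname + struct.pack("!HH", qtype, 1)
    return header + question

def doh_get_url(resolver_url, name, qtype=1):
    """Return the GET URL carrying the query as a base64url 'dns' parameter."""
    wire = build_dns_query(name, qtype)
    dns_param = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={dns_param}"

# Hypothetical resolver endpoint, used only to show the URL shape.
url = doh_get_url("https://dns.example/dns-query", "www.example.com")
print(url)
```

Because the query travels inside an ordinary HTTPS exchange, an on-path observer sees TLS traffic to a web endpoint rather than plaintext DNS. That is why distinguishing DoH from other HTTPS traffic has to rely on flow-level features rather than payload inspection.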

## Figures

50 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11157524/full.md

---
Source: https://tomesphere.com/paper/PMC11157524