Malware Traffic Classification: Evaluation of Algorithms and an Automated Ground-truth Generation Pipeline
Syed Muhammad Kumail Raza, Juan Caballero

TL;DR
This paper evaluates various algorithms for classifying encrypted malware traffic and introduces an automated pipeline for generating labeled ground-truth data using observable meta-data, aiding in model evaluation.
Contribution
It proposes a semi-supervised malware classification pipeline utilizing observable meta-data and an automated method for generating labeled datasets for evaluation.
Findings
Different clustering approaches tested for malware classification.
Automated ground-truth generation pipeline developed.
Framework aids in evaluating detection models.
Abstract
Identifying threats in encrypted network traffic is uniquely challenging. On one hand, decrypting the traffic is extremely difficult given modern encryption algorithms. On the other hand, passing an encrypted stream through pattern-matching algorithms is useless because encryption ensures there are no patterns to match. Moreover, evaluating detection models is difficult due to the lack of labeled benign and malware datasets. Other approaches have tackled this problem by employing observable meta-data gathered from the flow. We augment this approach by extending it to a semi-supervised malware classification pipeline using this observable meta-data. To this end, we explore and test different kinds of clustering approaches that make use of a unique and diverse set of features extracted from this observable meta-data. We also propose an automated packet…
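The general idea of clustering flows on observable meta-data and then labeling clusters from a few known flows can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the two features (normalized mean packet size and normalized flow duration), the toy values, the plain k-means routine, and the function names kmeans and propagate_labels are all hypothetical choices made for this example.

```python
from collections import Counter
from math import dist

def kmeans(points, k, iters=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each flow to its nearest centroid.
        assign = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return assign

def propagate_labels(assign, known):
    """Give every flow the majority label of the labeled flows in its cluster."""
    votes = {}
    for idx, label in known.items():
        votes.setdefault(assign[idx], Counter())[label] += 1
    cluster_label = {c: cnt.most_common(1)[0][0] for c, cnt in votes.items()}
    return [cluster_label.get(a, "unknown") for a in assign]

# Hypothetical per-flow features, already scaled to [0, 1]:
# (mean packet size, flow duration). No payload is inspected.
flows = [(0.10, 0.20), (0.90, 0.80), (0.15, 0.25),
         (0.12, 0.18), (0.85, 0.90), (0.95, 0.85)]

# Only two flows carry ground-truth labels (the semi-supervised part).
known = {0: "benign", 1: "malware"}

labels = propagate_labels(kmeans(flows, 2), known)
print(labels)
```

In this sketch, clusters that contain no labeled flow are reported as "unknown" rather than being forced into a class, which keeps the scarcity of ground truth visible in the output.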
Taxonomy
Topics: Network Security and Intrusion Detection · Internet Traffic Analysis and Secure E-voting · Advanced Malware Detection Techniques
