Maintainable Log Datasets for Evaluation of Intrusion Detection Systems

Max Landauer; Florian Skopik; Maximilian Frank; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber

arXiv:2203.08580 · cs.CR · May 16, 2023

TL;DR

This paper introduces a set of maintainable, labeled log datasets generated in a testbed environment for evaluating intrusion detection systems, addressing the lack of publicly available, reproducible datasets.

Contribution

It presents a scalable, model-driven approach to generate and label diverse intrusion detection datasets in a controlled testbed environment.

Findings

01 8 datasets with 20 log file types provided

02 Labeled 8 files with 10 attack steps each

03 Open-source code and datasets published online

Abstract

Intrusion detection systems (IDS) monitor system logs and network traffic to recognize malicious activities in computer networks. Evaluating and comparing IDSs with respect to their detection accuracies is thereby essential for their selection in specific use-cases. Despite a great need, hardly any labeled intrusion detection datasets are publicly available. As a consequence, evaluations are often carried out on datasets from real infrastructures, where analysts cannot control system parameters or generate a reliable ground truth, or private datasets that prevent reproducibility of results. As a solution, we present a collection of maintainable log datasets collected in a testbed representing a small enterprise. Thereby, we employ extensive state machines to simulate normal user behavior and inject a multi-step attack. For scalable testbed deployment, we use concepts from model-driven…
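The abstract's idea of driving simulated user behavior with state machines can be illustrated with a minimal sketch. This is not the authors' implementation: the states, the transition probabilities, and the `simulate_session` helper are all hypothetical and chosen only to show the general technique of a probabilistic walk over user activities.

```python
import random

# Hypothetical states and transition probabilities (assumptions for
# illustration only; none of these values come from the paper).
TRANSITIONS = {
    "idle": [("login", 1.0)],
    "login": [("read_mail", 0.6), ("browse_web", 0.4)],
    "read_mail": [("browse_web", 0.3), ("logout", 0.7)],
    "browse_web": [("read_mail", 0.5), ("logout", 0.5)],
    "logout": [],  # terminal state: no outgoing transitions
}


def simulate_session(rng, start="idle", max_steps=50):
    """Walk the state machine from `start` until a terminal state
    is reached or `max_steps` transitions have been taken."""
    state = start
    trace = [state]
    for _ in range(max_steps):
        options = TRANSITIONS[state]
        if not options:
            break
        targets = [target for target, _ in options]
        weights = [weight for _, weight in options]
        # Pick the next state according to the transition weights.
        state = rng.choices(targets, weights=weights, k=1)[0]
        trace.append(state)
    return trace


# A seeded generator keeps the simulated session reproducible.
session = simulate_session(random.Random(0))
print(" -> ".join(session))
```

In a testbed, each state would trigger a real action (for example, sending a request that produces log lines), so the sequence of visited states determines what normal activity appears in the logs.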
