StealthCup: Realistic, Multi-Stage, Evasion-Focused CTF for Benchmarking IDS
Manuel Kern, Dominik Steffan, Felix Schuster, Florian Skopik, Max Landauer, David Allison, Simon Freudenthaler, Edgar Weippl

TL;DR
StealthCup introduces a realistic, multi-stage, evasion-focused CTF framework for benchmarking IDS, revealing significant detection gaps and providing a reproducible, attacker-aligned evaluation methodology.
Contribution
This paper presents StealthCup, a novel multi-stage attack benchmarking approach that simulates realistic adversary behavior to evaluate IDS effectiveness more accurately.
Findings
11 out of 32 attack techniques went undetected by all IDS configurations.
Open-source IDS had false positives above 90%.
Commercial IDS missed more attacks but had fewer false positives.
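As a rough illustration of how the first finding could be tallied, the sketch below computes which techniques were missed by every IDS configuration and what share of the total that is. The function names, technique labels, and IDS names are hypothetical placeholders, not taken from the paper; only the 11-of-32 figure comes from the findings above.

```python
from typing import Dict, Set


def undetected_by_all(detections: Dict[str, Set[str]], techniques: Set[str]) -> Set[str]:
    """Return the techniques that no IDS configuration detected."""
    # Union of everything any configuration flagged at least once.
    detected_anywhere: Set[str] = set().union(*detections.values())
    return techniques - detected_anywhere


def undetected_share(n_undetected: int, n_total: int) -> float:
    """Fraction of techniques that were missed by every configuration."""
    return n_undetected / n_total


# Hypothetical toy data (not from the paper): four techniques, two IDS configurations.
techniques = {"T1", "T2", "T3", "T4"}
detections = {"ids_a": {"T1"}, "ids_b": {"T1", "T3"}}

missed = undetected_by_all(detections, techniques)
print(sorted(missed))  # ['T2', 'T4']

# Share implied by the reported finding: 11 of 32 techniques.
print(f"{undetected_share(11, 32):.1%}")  # 34.4%
```

Under this reading, roughly a third of the attack techniques evaded every IDS configuration.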
Abstract
Intrusion Detection Systems (IDS) are critical to defending enterprise and industrial control environments, yet evaluating their effectiveness under realistic conditions remains an open challenge. Existing benchmarks rely on synthetic datasets (e.g., NSL-KDD, CICIDS2017) or scripted replay frameworks, which fail to capture adaptive adversary behavior. Even MITRE ATT&CK Evaluations, while influential, are host-centric and assume malware-driven compromise, thereby under-representing stealthy, multi-stage intrusions across IT and OT domains. We present StealthCup, a novel evaluation methodology that operationalizes IDS benchmarking as an evasion-focused Capture-the-Flag competition. Professional penetration testers engaged in multi-stage attack chains on a realistic IT/OT testbed, with scoring penalizing IDS detections. The event generated structured attacker writeups, validated…
Taxonomy
Topics: Network Security and Intrusion Detection · Information and Cyber Security · Security and Verification in Computing
