Early-Stage Anomaly Detection: A Study of Model Performance on Complete vs. Partial Flows
Adrian Pekar, Richard Jozsa

TL;DR
This paper evaluates how machine learning models for network security perform when trained on complete data but tested on partial flows, revealing significant performance drops and thresholds for reliable detection.
Contribution
It systematically compares model performance on complete versus partial network flows, highlighting the impact of incomplete data on detection accuracy.
Findings
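Precision and recall drop by up to 30% when models trained on complete flows are tested on partial flows
For the evaluated dataset and model, around 7 packets in the test set appear necessary for reliable detection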
Models trained on complete flows struggle with partial flow testing
Abstract
This study investigates the efficacy of machine learning models in network security threat detection through the critical lens of partial versus complete flow information, addressing a common gap between research settings and real-time operational needs. We systematically evaluate how a standard benchmark model, Random Forest, performs under varying training and testing conditions (complete/complete, partial/partial, complete/partial), quantifying the performance impact when dealing with the incomplete data typical in real-time environments. Our findings demonstrate a significant performance difference, with precision and recall dropping by up to 30% under certain conditions when models trained on complete flows are tested against partial flows. The study also reveals that, for the evaluated dataset and model, a minimum threshold around 7 packets in the test set appears necessary for…
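The distinction between complete and partial flows can be made concrete with a minimal sketch. It assumes a flow is a time-ordered list of (timestamp, size) packet records and that a partial flow is simply its first N packets. The feature set, the `flow_features` and `conditions` helpers, and the sample flow are all hypothetical illustrations, not the authors' pipeline. The cutoff of 7 is taken from the threshold reported in the abstract.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

# Hypothetical packet record: (timestamp in seconds, size in bytes).
Packet = Tuple[float, int]


def flow_features(packets: List[Packet],
                  max_packets: Optional[int] = None) -> Dict[str, float]:
    """Summarise a flow using only its first `max_packets` packets.

    With max_packets=None the complete flow is used. The features here
    are illustrative assumptions, not the feature set used in the paper.
    """
    if max_packets is not None and max_packets < 1:
        raise ValueError("max_packets must be at least 1")
    seen = packets if max_packets is None else packets[:max_packets]
    if not seen:
        raise ValueError("flow has no packets")
    sizes = [size for _, size in seen]
    return {
        "packet_count": float(len(seen)),
        "total_bytes": float(sum(sizes)),
        "mean_size": float(mean(sizes)),
        "duration": seen[-1][0] - seen[0][0],
    }


def conditions(n: int) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Map each train/test condition named in the abstract to a pair of
    (training cutoff, testing cutoff); None means the complete flow."""
    return {
        "complete/complete": (None, None),
        "partial/partial": (n, n),
        "complete/partial": (None, n),
    }


# A made-up 10-packet flow, used only to show how the features shift.
flow: List[Packet] = [
    (0.0, 60), (0.1, 60), (0.2, 1500), (0.4, 1500), (0.5, 1500),
    (0.9, 1500), (1.0, 1500), (1.5, 400), (2.0, 60), (3.0, 60),
]

complete = flow_features(flow)
partial = flow_features(flow, max_packets=7)

for name in complete:
    print(f"{name}: complete={complete[name]:.2f} partial={partial[name]:.2f}")
```

In this sketch the same flow yields different counts, byte totals, and durations depending on the cutoff, which is one plausible reason a model fitted to complete-flow features could misjudge partial flows. A classifier such as Random Forest would be trained and tested on feature rows built with the cutoffs returned by `conditions`.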
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Data Stream Mining Techniques · Network Security and Intrusion Detection
Methods: Sparse Evolutionary Training
