Distributed Anomaly Detection in Edge Streams using Frequency based Sketch Datastructures
Prateek Chanda, Malay Bhattacharya

TL;DR
This paper introduces MDistrib, a distributed GPU-accelerated framework for real-time anomaly detection in large-scale network logs, offering faster detection, lower false positives, and higher accuracy using constant memory.
Contribution
The paper presents MDistrib, a novel distributed approach with collision-aware scoring that improves anomaly detection speed and accuracy over existing methods in large network data streams.
Findings
MDistrib detects anomalies faster than prior methods.
It achieves lower false positive rates with fixed memory.
The approach improves detection accuracy through collision-aware scoring.
Abstract
Logs hosted in large data centers often record network traffic over long periods of time. For instance, network traffic logged via a TCP dump packet sniffer (as considered in the 1998 DARPA intrusion attack) consists of the network packets transmitted between computers. An online framework is necessary for detecting anomalous or suspicious network activity, such as denial-of-service attacks or unauthorized usage, in real time. However, because such data centers log data over long periods (e.g., TCP dump), an offline framework is often more suitable in these scenarios. Given a network log history of edges from a dynamic graph, how can we assign anomaly scores to individual edges, indicating suspicious events, with higher accuracy and in less time than state-of-the-art methods while using only constant memory? We propose MDistrib and its variants which…
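To make the constant-memory question concrete, here is a minimal single-machine sketch of frequency-based edge scoring. It is not MDistrib's algorithm: it pairs a generic count-min sketch with a chi-squared style burst score of the kind used in earlier streaming edge detectors, and it omits the distributed, GPU, and collision-aware parts described above. The names `CountMinSketch` and `EdgeScorer`, the table sizes, and the assumption that ticks start at 1 and never decrease are all hypothetical choices made for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency table; memory does not grow with the stream."""

    def __init__(self, rows=4, cols=1024):
        self.rows = rows
        self.cols = cols
        self.table = [[0.0] * cols for _ in range(rows)]

    def _index(self, row, key):
        # One salted hash per row, so each row acts as an independent hash function.
        digest = hashlib.blake2b(
            repr(key).encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.cols

    def add(self, key, amount=1.0):
        for r in range(self.rows):
            self.table[r][self._index(r, key)] += amount

    def estimate(self, key):
        # Collisions can only inflate a cell, so the minimum is the tightest estimate.
        return min(self.table[r][self._index(r, key)] for r in range(self.rows))

    def clear(self):
        for row in self.table:
            for i in range(self.cols):
                row[i] = 0.0


class EdgeScorer:
    """Scores each edge by how far its current-tick count departs from its past mean."""

    def __init__(self, rows=4, cols=1024):
        self.current = CountMinSketch(rows, cols)  # counts within the current tick
        self.total = CountMinSketch(rows, cols)    # counts over the whole stream
        self.tick = 1                              # assumption: ticks start at 1

    def score(self, src, dst, tick):
        if tick > self.tick:
            self.current.clear()
            self.tick = tick
        key = (src, dst)
        self.current.add(key)
        self.total.add(key)
        a = self.current.estimate(key)
        s = self.total.estimate(key)
        t = self.tick
        if t == 1:
            return 0.0  # no history to compare against yet
        # Chi-squared style statistic: observed count a versus expected mean s / t.
        return (a - s / t) ** 2 * t * t / (s * (t - 1))


if __name__ == "__main__":
    scorer = EdgeScorer()
    for tick in range(1, 11):
        scorer.score(1, 2, tick)           # steady edge: one occurrence per tick
    burst = [scorer.score(1, 2, 11) for _ in range(50)]
    print(f"score at end of burst: {burst[-1]:.1f}")
```

An edge that arrives at its usual rate scores near zero, while a sudden burst within one tick drives the score up. Both tables have a fixed number of cells regardless of how many edges are processed, which is the sense in which such sketches use constant memory.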
Taxonomy
Topics: Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications · Software System Performance and Reliability
